THIS ISSUE BINDER IS INTENDED TO PROVIDE A BASIC, 
REGARDING A SPECIFIC TOPIC ON ETS AND THE HEALTH OF 
STATISTICS AND EPIDEMIOLOGY 


"Nevertheless, in a real sense, statistics is 
the study of populations, or aggregates of 
individuals, rather than of individuals. 
Scientific theories which involve the properties 
of large aggregates of individuals, and not 
necessarily the properties of the individuals 
themselves . . . are essentially statistical 
arguments, and are liable to misinterpretations 
as soon as the statistical nature of the 
argument is lost sight of." 


Sir Ronald Fisher 
Statistical significance and confidence intervals 


Many papers in the Journal use 
statistical methods and one of the 
aims of the review process is to try 
to ensure that appropriate methods have 
been used. Often papers report results of 
comparative studies that are designed to 
answer questions such as whether one 
treatment is superior to another for a 
particular disease, or whether there is an 
association between some form of behaviour 
(for example, taking regular exercise or 
smoking) and the occurrence of some 
disease. Comparative studies are almost 
invariably carried out on a sample of 
individuals who are chosen from the 
population of individuals to whom it is 
intended to generalize the results. Data are 
collected on the sample in order to make 
inferences on the population. Valid 
inferences can only be drawn if the sample 
is chosen in such a way that it is representative 
of the population. Otherwise a bias 
could occur; epidemiological methods are 
designed to eliminate such biases. 

Since the aim of a statistical analysis is to 
make inferences, it is paramount to express 
whatever inferences can be drawn in the 
most informative way. There are several 
methods of statistical inference, but the two 
that are most commonly used are 
significance testing and confidence interval 
estimation. The former is well known and 
is characterized by the quoting of P values. Many 
authors appear to be under the impression 
that a profusion of P values is necessary; 
regrettably this impression has been bolstered 
in the past by editors of biological journals. 
Significance testing has its place but, as 
mentioned by Healy in 1978,¹ "it is widely 
agreed among statisticians (if less so among 
the more naive users of statistics) that 
significance testing is not the be-all and end- 
all of the subject". In this leading article I 
would like to discuss the characteristics of 
both methods of inference, show that a 
confidence interval contains the result of a 
significance test, but not vice versa, and 
suggest that confidence intervals are the 
answers to the more interesting questions 
that data can be used to answer. 

Any particular study is based on a 
particular sample; however, it is useful to 
imagine that the study is repeated with a 
different sample being selected each time. 
These hypothetical studies will give different 
results because they contain different 
individuals, and individuals vary in any 
characteristic because of biological variability. 
The differences are termed sampling 
variability. It follows then that the results 
that are obtained from a particular sample 
can only be taken as an approximation to the 
actual situation in the whole population. 
Statistical methods are concerned with 
assessing the degree of approximation and 
what may be reasonably inferred, given that 
a different sample would have produced a 
different result. 


The methods are based on the assumption 
that it is a matter of chance which particular 
subjects are in the sample that is being 
studied, and the sampling variability is thus 
random variation which is determined by the 
laws of probability. Therefore, the inferences 
are expressed in terms of probability. The 
situation is illustrated below. 

[Diagram: Population -> (sampling variation) -> Sample data -> (uncertainty) -> Inferences on population] 

Taking a sample from the population 
involves sampling variation. As a consequence 
of this, inferences from the sample 
data back to the population involve 
uncertainty. 
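
The idea of repeating a study with a different sample each time can be illustrated with a small simulation. The sketch below is illustrative only: the population (blood pressure with mean 120 and standard deviation 15), the sample size of 50 and the function name `simulate_sample_means` are assumptions chosen for the example, not values taken from the article.

```python
import random
import statistics

def simulate_sample_means(n_studies=1000, sample_size=50,
                          pop_mean=120.0, pop_sd=15.0, seed=1):
    """Draw repeated samples from a hypothetical population and
    return the mean observed in each simulated study."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_studies):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means()
# Each hypothetical study gives a slightly different answer.
print(f"lowest and highest sample mean: {min(means):.1f}, {max(means):.1f}")
# The spread of those answers is the sampling variability; in theory it
# is pop_sd / sqrt(sample_size) = 15 / sqrt(50), roughly 2.1.
print(f"SD of the sample means: {statistics.stdev(means):.2f}")
```

No single simulated study reproduces the population mean exactly, which is the sense in which any one sample gives only an approximation to the actual situation.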

A statistical analysis may be thought of as 
asking questions of the data. In an investigation 
that compares two groups for the 
mean value of, for example, blood pressure 
or the prevalence of some disease, three 
questions may be posed: Is there a difference 
between the groups? How large is the 
difference? and How accurately is the size 
of the difference known? 

As expressed, the first question expects the 
answer "yes" or "no"; although the answer 
cannot be given in precisely these terms, it 
is often reduced to two possibilities. The 
appropriate methodology is the significance 
test. The second question expects a numerical 
value to be the answer. This is an estimate 
and, as it is a single value, is referred to as 
a point estimate. In effect, the third question 
asks how reliable this point estimate is; the 
answer is a range of values which is referred 
to as an interval estimate or a confidence 
interval. 

These questions represent two approaches 
to inference: hypothesis testing and 
estimation. Although at first sight they 
appear to be quite different, in concept they 
have much in common. Both make 
inferential statements about the value of a 
parameter. (A parameter is an unknown 
quantity which partly or wholly characterizes 
a population, for example, a mean or a 
measure of association.) 

The significance test is an appropriate 
technique when there is an a priori hypothesis 
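to test. For the purpose of the statistical test 
this hypothesis is expressed in null form, 
such as when no difference exists between 
groups, and the test evaluates whether the 
data are consistent with this hypothesis. 
If the data differ markedly from those which 
would be expected under the null hypothesis 
to the extent that the probability of such an 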
extreme result is low, then it is said that the 
result is statistically significant. Probability 
is measured on a continuum between 0 and 
1, but in significance testing a probability is 
considered low if it is less than conventional 
values such as 0.05 (5%) or 0.01 (1%). A 
significant result is equated with the rejection 
of the null hypothesis or the claim of a real 
effect. By definition, when the null 
hypothesis is true, significant results will 
occur by chance with the same relative 
frequency as the significance probability. 
That is, real effects will be claimed when the 
null hypothesis is true; however, the probability 
of this error (type I) is determined in 
the data analysis. 
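
The statement that, under a true null hypothesis, significant results occur with the same relative frequency as the significance probability can be checked by simulation. The following is a minimal sketch under stated assumptions: two groups of 100 subjects sharing a true rate of 30%, a normal-approximation test for two proportions, and the hypothetical helper names `two_sided_p` and `false_positive_rate`.

```python
import math
import random
from statistics import NormalDist

def two_sided_p(successes_a, n_a, successes_b, n_b):
    """Two-sided P value for a difference between two proportions
    (normal approximation with a pooled standard error)."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

def false_positive_rate(n_trials=2000, n_per_group=100, true_rate=0.3,
                        alpha=0.05, seed=1):
    """Fraction of simulated trials declared significant when the
    null hypothesis is true (both groups share the same true rate)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        a = sum(rng.random() < true_rate for _ in range(n_per_group))
        b = sum(rng.random() < true_rate for _ in range(n_per_group))
        if two_sided_p(a, n_per_group, b, n_per_group) < alpha:
            hits += 1
    return hits / n_trials

rate = false_positive_rate()
print(f"share of 'significant' results under the null: {rate:.3f}")
```

The printed share should lie close to the chosen significance level of 0.05, although the normal approximation and the finite number of simulated trials mean it will not match exactly.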

One disadvantage of a significance test is 
that it may fail to detect a real effect; that 
is, although the null hypothesis is false, the 
evidence is not strong enough to reject it. The 
probability of this error (type II) can be 
controlled at the design stage only, by 
appropriate selection of the sample size, and 
may be quite large. Thus, the trap of 
equating non-significance with no effect 
must be avoided; failure to reject the null 
hypothesis is not the same as accepting it. 
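
How strongly the type II error depends on sample size can be seen from an approximate power calculation. The sketch below uses a standard normal approximation for comparing two proportions; the rates of 30% and 45% and the function name `approx_power` are hypothetical, chosen only to illustrate the point.

```python
import math
from statistics import NormalDist

def approx_power(p_control, p_treated, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions
    with equal group sizes (normal approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    se = math.sqrt(p_control * (1 - p_control) / n_per_group
                   + p_treated * (1 - p_treated) / n_per_group)
    shift = abs(p_treated - p_control) / se
    # Probability that the test statistic falls beyond either critical value.
    return (1 - norm.cdf(z_crit - shift)) + norm.cdf(-z_crit - shift)

for n in (25, 50, 100, 200, 400):
    power = approx_power(0.30, 0.45, n)
    print(f"n per group = {n:3d}: power = {power:.2f}, "
          f"type II error = {1 - power:.2f}")
```

With the smallest groups the real difference assumed here would be missed more often than it is detected, which is why a non-significant result from a small study says little about the absence of an effect.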

In the approach of confidence interval 
estimation no particular hypothesis is considered; 
rather, the emphasis is on estimating 
those values of the parameter with which the 
data are consistent. These values form a 
range, the confidence interval. The range 
is calculated so that there is a high probability 
(conventionally 95% or 99%) that 
it contains the true value of the parameter. 

A significance test is essentially a test of 
whether the data are consistent with a 
specified parameter value, and the confidence 
interval contains those parameter 
values with which the data are consistent. 
Therefore, a 5% significance test and a 95% 
confidence interval contain some information 
in common: significance implies that 
the null hypothesis value is outside the confidence 
interval; non-significance implies that 
the null hypothesis value is within the confidence 
interval. However, the confidence 
interval contains more information because 
it is equivalent to performing a significance 
test for all values of the parameter, not just 
a single value. A confidence interval enables 
a reader to see how large the effect may be, 
not simply whether it is different from zero. 
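
The correspondence between a 5% significance test and a 95% confidence interval can be demonstrated numerically. This is a sketch assuming a normally distributed estimate; the estimate of 4.0, its standard error of 1.5 and the function names are hypothetical.

```python
from statistics import NormalDist

def confidence_interval(estimate, se, level=0.95):
    """Normal-approximation confidence interval for an estimate."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se

def is_significant(estimate, se, null_value, alpha=0.05):
    """Two-sided z test of the hypothesis 'parameter equals null_value'."""
    z = abs(estimate - null_value) / se
    p_value = 2 * (1 - NormalDist().cdf(z))
    return p_value < alpha

estimate, se = 4.0, 1.5  # hypothetical difference and its standard error
low, high = confidence_interval(estimate, se)
print(f"95% CI: {low:.2f} to {high:.2f}")
for null_value in (0.0, 1.0, 2.0, 6.0, 7.0, 8.0):
    inside = low <= null_value <= high
    print(f"null value {null_value}: significant at 5% = "
          f"{is_significant(estimate, se, null_value)}, inside CI = {inside}")
```

Every null value that the test rejects lies outside the interval, and every value it fails to reject lies inside, so the interval summarizes the outcome of the test for all candidate values at once.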

The limitations of the interpretations that 
are provided by a significance test may now 
be considered. 

The difference is significant. This means 
that there is a difference or, in other words, 
the size of the difference is not zero. We 
know no more than this. The difference may 
be large and of great importance or it may 
be small and of no practical importance. It 
is unsatisfactory that the test provides no way 
of distinguishing between these quite 
different possibilities. 

The difference is not significant. This 
means that there is insufficient evidence to 
enable us to conclude that there is a 
difference. So the difference may well be 
zero. But this is not the same as saying that 
it is zero. The true difference may be quite 
large. Again, it is unsatisfactory that this 
possibility is not addressed. 

The conclusions that may be drawn from 
a significance test are considered to be 
incomplete because it is rarely that one is 
interested solely in whether a null hypothesis 
is or is not true; indeed in many cases it may 
be recognized at the outset that the null 
hypothesis is unlikely to be true. Rather, the 
question is how large is the difference and 
is it possibly large enough to be important? 
The emphasis is on measuring rather than on 
testing. The addition of the concept of an 
important difference to that of a null 
hypothesis means that there are four possible 
interpretations to an analysis: (a) the 
difference is significant and large enough to 
be of practical importance; (b) the difference 
is significant but too small to be of practical 
importance; (c) the difference is not 
significant but may be large enough to be 
important; and (d) the difference is not 
significant and also not large enough to be 
of practical importance. 
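
The four interpretations can be read directly from a confidence interval once a smallest important difference has been agreed. The sketch below is one possible reading, not a rule given in the article: it treats a difference as "possibly important" when the interval reaches the threshold, and the threshold of 10 percentage points and all but the first interval are hypothetical.

```python
def interpret(ci_low, ci_high, important_difference):
    """Classify a confidence interval for a difference as (a) to (d).

    Assumption: 'significant' means the interval excludes zero, and the
    difference 'may be important' if the interval reaches the threshold
    in either direction.
    """
    significant = ci_low > 0 or ci_high < 0
    may_be_important = max(abs(ci_low), abs(ci_high)) >= important_difference
    if significant:
        return "a" if may_be_important else "b"
    return "c" if may_be_important else "d"

THRESHOLD = 10.0  # hypothetical smallest important difference (percentage points)
examples = {
    "stroke unit trial (7.3 to 28.9)": (7.3, 28.9),
    "hypothetical small effect": (1.0, 6.0),
    "hypothetical small study": (-4.0, 20.0),
    "hypothetical large null study": (-3.0, 5.0),
}
for label, (low, high) in examples.items():
    print(f"{label}: interpretation ({interpret(low, high, THRESHOLD)})")
```

A different threshold would reclassify some intervals, which is the sense in which the interval, unlike the bare test result, supports interpretation for any value considered appropriate.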


The size of difference that is considered 
to be large enough to be important is a 
matter for debate, and genuine differences 
of opinion may arise. It is a medical, not a 
statistical, question, although a medical 
statistician who is experienced in the subject 
area could contribute to setting a value. The 
fact that agreement on a unique value may 
be impossible in no way detracts from the 
argument. In fact, expressing the results as 
a confidence interval enables interpretations 
to be made for any particular value that is 
considered appropriate. 

These possibilities are illustrated in the 
Figure where the confidence intervals are 
shown. The significant and non-significant 
cases are distinguished by the confidence 
intervals that exclude or include zero respectively. 
The main point is that in each case 
the confidence interval gives the range of 
possible values for the true difference. Of 
particular concern is (c). Here there may be 
no true difference or there may be a large, 
important difference. In other words the 
study is completely inconclusive. Such a 
possibility is missed by the simple expression 
"not significant" with its lure of equating 
this falsely with "no effect". This situation 
will arise with a study that is carried out on 
too small a sample and this is why good study 
design demands attention to sample size to 
try to prevent the occurrence of an inconclusive 
result. Altman found that it was 
common for undue emphasis to be placed on 
"negative" findings from small studies,² 



[Figure not reproduced; interval labels: Important, Not important, Inconclusive, True negative result.] 

FIGURE: Confidence intervals showing four possible conclusions in terms of statistical significance 
and practical importance. 


while Freiman et al. noted that "negative" 
trials were often too small to constitute a fair 
test of therapies.³ Similarly, a significance 
test will contrast (b) as significant and (d) as 
not significant but fails to recognize that they 
give essentially the same conclusion: that 
any difference is too small to be important. 

As an example, consider some results 
which were obtained by Garraway et al. from 
a clinical trial for the management of acute 
stroke in the elderly.⁴ Of 155 patients who 
were managed in a stroke unit, 78 were 
assessed as independent when they were 
discharged from the unit compared with 49 
of 152 who were managed in a medical unit. 
The simplest analysis shows that the 
difference between the success rates of the 
two units is significant at the 1% level. 
Therefore, a genuine effect has been established. 
To appreciate the importance of this 
effect the advantage of the stroke unit may 
be measured by the difference between the 
two units in the percentage of subjects 
who were discharged as independent: 
50.3% - 32.2% = 18.1%. This is the point 
estimate. The accuracy of this estimate is 
given by its standard error (5.5) and the 95% 
confidence limits (7.3% and 28.9%). Thus, 
the gain could be as large as 29% or as small 
as 7%. 
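
The figures in this example can be reproduced with the usual normal approximation for a difference between two proportions. The sketch below assumes 78 of 155 stroke unit patients were independent at discharge, the count implied by the quoted 50.3%; the function name `compare_proportions` is hypothetical, and the article does not state which method its "simplest analysis" used.

```python
import math
from statistics import NormalDist

def compare_proportions(success_1, n_1, success_2, n_2, level=0.95):
    """Difference between two proportions with its standard error,
    confidence interval and two-sided P value (normal approximation)."""
    p1, p2 = success_1 / n_1, success_2 / n_2
    diff = p1 - p2
    # Unpooled standard error, used for the confidence interval.
    se = math.sqrt(p1 * (1 - p1) / n_1 + p2 * (1 - p2) / n_2)
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # Pooled standard error, used for the test of no difference.
    pooled = (success_1 + success_2) / (n_1 + n_2)
    se_null = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff) / se_null))
    return diff, se, ci, p_value

diff, se, (low, high), p_value = compare_proportions(78, 155, 49, 152)
print(f"difference     = {100 * diff:.1f}%")   # 18.1%
print(f"standard error = {100 * se:.1f}%")     # 5.5%
print(f"95% limits     = {100 * low:.1f}% and {100 * high:.1f}%")  # 7.3% and 28.9%
print(f"significant at the 1% level: {p_value < 0.01}")
```

The single statement "significant at the 1% level" is recovered from the last line, whereas the interval in the third line also conveys how large or small the gain might be.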

Recently, Gardner and Altman have 
argued against the excessive use of hypothesis 
testing and urged a greater use of confidence 
intervals.⁵ In an appendix to their paper they 
give methods to calculate confidence 
intervals for the commonly occurring two- 
sample comparisons. 

In presenting the main results of a study 
it is good practice to provide confidence 
intervals rather than to restrict the analysis 
to significance tests. Only by so doing can 
authors give readers sufficient information 
for a proper conclusion to be drawn; 
otherwise readers have to rely upon the 
authors' own interpretation. Therefore, 
intending authors are urged to express their 
main conclusions in confidence interval form 
(possibly with the addition of a significance 
test, although strictly that would provide no 
extra information). One of the aims of the 
Journal's statistical review process will be to 
ensure that where possible this is done. 

GEOFFREY BERRY 

Associate Professor of Biostatistics 
School of Public Health and Tropical Medicine 
The University of Sydney 

1. Healy MJR. Is statistics a science? J R Statist Soc A 
1978; 141: 385-393. 

2. Altman DG. Statistics in medical journals. Stat Med 
1982; 1: 59-71. 

3. Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR. 
The importance of beta, the type II error and sample 
size in the design and interpretation of the randomised 
control trial. N Engl J Med 1978; 299: 690-694. 

4. Garraway WM, Akhtar AJ, Prescott RJ, Hockey L. 
Management of acute stroke in the elderly: preliminary 
results of a controlled trial. Br Med J 1980; 280: 
1040-1043. 

5. Gardner MJ, Altman DG. Confidence intervals rather 
than P values: estimation rather than hypothesis 
testing. Br Med J 1986; 292: 746-750. 
ENVIRONMENTAL TOBACCO SMOKE 


Scientific Method 

Scientific inquiry within an epidemiologic study begins 
with framing what is called a "null-hypothesis." The null- 
hypothesis states, in this instance, that ETS is not associated 
with a given disease state (e.g., lung cancer, heart disease, etc.). 
Data are then collected and analyzed in order to test, i.e., reject 
or accept, the hypothesis. 

One method which is used to assess the relationship of 
the collected data to a given hypothesis is the test for statistical 
significance.* Simply put, if the data examined yield a 
statistically significant result (here, the relationship between 
ETS exposure and a disease state), then the scientist is permitted, 
on the basis of those data, to reject the null-hypothesis. If the 
statistical test is not significant, then the data do not support 
rejection of the null hypothesis. 


* By convention, a 'p' (probability) value less than 0.05 is 
deemed statistically significant. A 'p' value less than 0.05 
means that, if there were no true association, results at least 
as extreme as those observed would occur by chance less 
than 5 times out of 100. 

"Confidence limits" are the values between which the risk 
value can be expected to fall 95% of the time based on the 
variability of the underlying data. When the 95% confidence 
limits span 1.00 (the lower limit is below 1.00 and the upper 
limit is above it), the risk value is considered not statistically 
significant, i.e., the results may well be due to chance and do 
not support a judgment regarding an association between 
exposure and disease. 
There is no "absolute proof" involved, and there is 
nothing immutable about the concept of significance testing. 
Statistical significance is, after all, a convention. But the 
concept is illustrative, especially in the case of the association 
between ETS exposure and lung cancer. To date, there have been 28 
published reports on ETS and lung cancer, and only five have 
achieved statistical significance. It is clear that the 
preponderance of data does not permit rejection of the null 
hypothesis, i.e., that there is no association between ETS exposures 
and lung cancer. In addition, virtually all of the individual 
risks reported in such studies are less than 2, which, to the 
epidemiologist, suggests a "weak" association which is probably 
the result of bias or of confounding by factors unrelated to ETS. 


Inadequacies of ETS Studies 


Epidemiologic studies are notoriously unreliable 
in outcome. An observed relative risk of less 
than 1.5-2.0 (some would say up to 3.0) is 
inadequate to reject the hypothesis of no 
effect. The overall relative risk calculated 
across studies is well below a minimal value 
for seriously attributing it to the presence 
of a real effect, i.e., it is within the range 
easily due to the "noise" in epidemiologic 
data resulting from the limitations and vagaries 
intrinsic to the methodology and its 
application. This same conclusion also applies 
to nearly all of the studies on an individual 
basis. Another reason for conservative 
interpretation of the ETS studies is that 
several studies are of poor quality (good 
textbook examples of how not to do an 
epidemiologic study) and some were originally 
designed for a different, or broader, purpose 
than assessing health risks from ETS exposure. 
Sources of bias are present to varying degrees 
in most of the studies. Lung cancer patients 
may tend to overstate their exposure to spousal 
smoking as an explanation for their illness. 

Bias may result from depending on memory recall 
of a subject's exposure to spousal smoking. 

Estimates of relative risk may differ markedly 
between data collected from the subjects and 
data obtained from a surrogate, such as their 
children. Histologic verification of lung 
cancer was not conducted in all studies and 
the error rate may be substantial, e.g., 13% 
of the lung cancer cases in the case-control 
study of Garfinkel et al. were found to be 
incorrectly diagnosed when the histology was 
reviewed by one of the authors. (From: 
Summary of Public Docket Comments, Draft Risk 
Assessment, U.S. EPA, Dec. 1990.) 

Peter Lee, a statistician and epidemiologist from the 
United Kingdom, has argued that the increased risks reported in 
various epidemiologic studies are the result of an inherent bias 
in study design rather than the result of any genuine effect from 
exposure to ETS.¹⁻⁵ Lee presents data which indicate that the reported 
risks cannot be explained on the basis of either ETS exposure 
or dose for the nonsmoker. It is Lee's contention that the 
reported "risks" are the result of bias caused by a small number 
of smokers who are misreported in the studies as nonsmokers. 

Other kinds of misclassification may contribute to the 
reported increase in lung cancer risks among nonsmokers, according 
to several scientists. For example, none of the studies on ETS 
and lung cancer provides direct observational information on ETS 
exposures. Instead, spouses, next-of-kin or friends are asked to 
estimate the amount of ETS to which they think the subject was 
exposed. Such estimates may lead to a kind of misclassification, 
called exposure misclassification,⁶ which has been shown by 
Garfinkel,⁷ Friedman⁸ and others to lead to improper indices of 
exposure and incorrect estimations of risk. In Garfinkel's study, 
for example, relative risks ranged from 0.83 and 0.77, when the 
woman with lung cancer or her husband was the respondent, to 
3.57 when a son or daughter responded.¹³ That means that 
the reported risk for lung cancer in the women exposed to ETS was 
less than for women not exposed when either the women's or their 
husbands' estimates were used. 

Dr. S. James Kilpatrick, a biostatistician from the 
Medical College of Virginia, has analyzed another form of misclassification, 
called differential misclassification, which results 
"from the tendency of respondents to inflate the amount of ETS 
exposure for lung cancer cases and deflate the report of exposure 
for controls."⁶ Similarly, Dr. Ernst Wynder, President of the 
American Health Foundation, notes that "relatives of a nonsmoking 
lung cancer patient are more likely to report passive inhalation 
exposure on the part of their relative than are relatives of a 
control patient."¹⁴ 
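
The arithmetic by which differential misclassification could produce an apparent association can be sketched with an entirely hypothetical case-control table. In the example below, the counts (200 cases and 200 controls, half of each truly exposed), the 10% over-reporting and under-reporting fractions and the function names are all assumptions made for illustration; none is taken from any ETS study.

```python
def odds_ratio(cases_exposed, cases_unexposed,
               controls_exposed, controls_unexposed):
    """Odds ratio from the four cells of a case-control table."""
    return ((cases_exposed * controls_unexposed)
            / (cases_unexposed * controls_exposed))

def apply_reporting_bias(exposed, unexposed, over_report=0.0, under_report=0.0):
    """Move a fraction of truly unexposed subjects into the exposed
    group (over-reporting) and a fraction of truly exposed subjects
    into the unexposed group (under-reporting)."""
    moved_up = unexposed * over_report
    moved_down = exposed * under_report
    return exposed + moved_up - moved_down, unexposed - moved_up + moved_down

# Hypothetical truth: exposure is unrelated to disease (odds ratio 1.0).
true_cases = (100, 100)      # (exposed, unexposed)
true_controls = (100, 100)
true_or = odds_ratio(*true_cases, *true_controls)

# Respondents for cases inflate exposure; respondents for controls deflate it.
reported_cases = apply_reporting_bias(*true_cases, over_report=0.10)
reported_controls = apply_reporting_bias(*true_controls, under_report=0.10)
observed_or = odds_ratio(*reported_cases, *reported_controls)

print(f"true odds ratio:     {true_or:.2f}")
print(f"observed odds ratio: {observed_or:.2f}")
```

Under these assumed error rates an odds ratio of about 1.5 appears where none exists; how large such reporting errors are in any actual study is an empirical question that this sketch does not address.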

A more subtle form of potential bias is known as 
"publication bias", which stems from the apparent failure by 
journals to publish studies which report negative or weakly positive 
results.¹⁵,¹⁶ Scientists have recently expressed concern over the 
growing trend among such journals to overemphasize (and hence to 
publish) only those studies which report positive increases in 
risk.¹⁷,¹⁸ Published studies which are combined for meta-analyses 
therefore may not truly represent all investigations on the issue 
of ETS exposures and lung cancer. 

Most of the epidemiological studies on ETS and lung 
cancer have failed to consider age differences, diet, occupation 
and exposures to indoor or outdoor pollution as potential 
confounding elements. The importance of such factors is 
underscored by recently published reports from Japan and China.¹⁹⁻ 
The reports suggest that indoor pollution generated by kerosene 
heaters, coal stoves, liquefied petroleum gas and exposures 
to cooking oil vapors may be responsible for the increased risk of 
lung cancer among Oriental women. Moreover, in 1989, researchers 
in the U.S. reported that nonsmokers living with smokers consumed 
less carotene (Vitamin A) than did nonsmokers who lived with other 
nonsmokers. They concluded that "dietary beta-carotene intake is 
a potential confounder and should be measured whenever possible in 
studies of the relation between passive smoking and lung cancer."²⁵ 

Dr. Karl Uberla of Germany recently explained why any 
attempts to generalize about the significance of reported results 
of epidemiological studies on ETS and nonsmoker lung cancer will 
likely remain unconvincing, due to scientific deficiencies in each 
of the studies.²⁶ He wrote: 

The majority of criteria for a causal 
connection are not fulfilled. There is no 
consistency, there is a weak association, there 
is no specificity, the dose-effect relation 
can be viewed controversially, bias and 
confounding are not adequately excluded, there 
is no intervention study, significance is only 
present under special conditions and the 
biologic plausibility can be judged 
controversially. 
Given these difficulties in interpretation, it is 
therefore not surprising that an eminent statistician should 
conclude that "it is unlikely that any epidemiological study has 
been, or can be, conducted which could permit establishing that 
the risk of lung cancer has been raised by passive smoking. 
Whether or not the risk is raised remains to be taken as a matter 
of faith according to one's choice."¹⁵ 

Thus, proponents of the ETS health issue are confronted 
with weak associations and generally statistically nonsignificant 
risks in epidemiological studies on ETS. They are nevertheless 
forced to posit a causal mechanism for their theoretical model 
regarding health risks. They find no support in data from the 
actual exposure studies on ETS which suggest that an average 
nonsmoker is exposed, for example, to the nicotine equivalent of 
one one-hundredth to one one-thousandth (or less) of a single 
cigarette per hour. Such exposure data suggest that there is no 
conclusive biological plausibility to the ETS health claim, and 
that the reported risks in epidemiological studies may be 
artefactual, and probably due to bias and unconsidered confounders. 
Environmental Tobacco Smoke and Lung Cancer: 
A Critical Assessment* 

E. L. Wynder and G. C. Kabat 


Summary 

The possibility that exposure to environmental tobacco smoke (ETS) may increase the 
lung cancer risk of nonsmokers has become a cause of public concern. It is unknown 
whether the levels of carcinogens in the diluted sidestream smoke of tobacco products 
that reach the nonsmoker's lung are sufficient to induce cancer. Available epidemiologic 
studies suggest a slight increase in the relative risk of lung cancer in nonsmokers due to 
exposure to ETS created by a smoking spouse. However, not all studies have found a 
significant association. The epidemiologic studies are examined in the light of the criteria 
of judgment of causality, including strength of association, consistency, temporality, 
methodological issues, and biological plausibility. Suggestions for further research, 
including studies in high-exposure populations and greater attention to histology, are 
proposed. 


Introduction 

Epidemiologists, chemists, biologists, physiologists, physicians, and public health 
officials have given much attention to the association of environmental tobacco smoke 
(ETS) exposure and the development of lung cancer in nonsmokers. A biological basis 
for such an association clearly exists because smoke constituents demonstrated to be 
carcinogenic in laboratory animals are inhaled and retained by the nonsmoker. 
Metabolites of tobacco-specific smoke constituents have been identified in the saliva, 
blood, and urine of nonsmokers after exposure to ETS (Greenberg et al. 1984; Hoffmann 
et al. 1984; National Academy of Sciences 1986; USDHHS 1987; Sepkovic et al. 1988). 
Several epidemiological studies have found a positive association between ETS exposure 
- usually defined as being due to a smoking spouse - and lung cancer (Hirayama 1981; 
Trichopoulos et al. 1981; Correa et al. 1983; Sandler et al. 1985; Garfinkel et al. 1985; 
Akiba et al. 1986; Dalager et al. 1986; Pershagen et al. 1987). Other studies have found no 
significant association (Garfinkel 1981; Chan and Fung 1982; Koo et al. 1983; Kabat and 
Wynder 1984; Wu et al. 1985; Lee et al. 1986). No consistent association has been 
reported for lung cancer and exposure to ETS in childhood, which might be expected to 
exert a greater effect, especially when followed by exposure throughout adulthood. Of 
course, recall of ETS exposure in childhood is more difficult than recall of such exposure 
in adulthood. 


* Research described herein was performed under USPHS, National Cancer Institute Program 
Project Grant CA-32617. 


H. Kasuga (Ed.) Indoor Air Quality 
© Springer-Verlag, Berlin Heidelberg 1990 
The epidemiological study of weak associations is burdened with problems that may 
yield artifactual positive findings or may show negative findings where a real association 
exists. The association of ETS and lung cancer risk, even if weak, would still be of concern 
as a public health problem in that most people are at one time or another exposed to 
smoke from burning tobacco products and the exhaled pollutants of tobacco smokers. A 
weak association in epidemiology requires careful examination and an understanding of 
the variables in question and all of the factors influencing the association (Wynder 1987). 

In this overview we critically examine the published studies on ETS exposure and lung 
cancer to determine whether the evidence presented to date permits a sound conclusion as 
to causation. 


General Exposure to ETS 

At the outset we need to emphasize that an association between ETS and lung cancer 
must be deemed possible. A recent survey of self-reported exposure in a hospitalized 
population revealed that 66% of men and 60% of women had ETS exposure in 
childhood; 32% of the men and 61% of the women reported ETS exposure in the home in 
adulthood; and 60% of the men and 62% of the women who worked outside the home 
reported ETS exposure at work (Kabat and Wynder, unpublished data, 1987). 


Critical Assessment 

The first Surgeon-General’s Report on Smoking and Health, published in 1964 (USPHS 
1964), clearly delineated the criteria of judgment for causality. These criteria included: 
the magnitude of the association, consistency, temporality, and biological plausibility. 
Since these criteria were considered necessary to prove causation for a strong association, 
namely, active smoking and lung cancer, they should be equally required to determine the 
causality of weak associations (Wynder 1987). Let us examine the epidemiological 
evidence linking ETS with lung cancer in respect to these criteria. 


Strength of the Association 

An association is generally considered weak if the odds ratio is under 3.0 and particularly 
when it is under 2.0, as is the case in the relationship of ETS and lung cancer (Table 1). If 
the observed relative risk is small, it is important to determine whether the effect could be 
due to biased selection of subjects, confounding, biased reporting, or anomalies of 
particular subgroups. 


Consistency 

If an association is real, internal consistency should be apparent within and between 
different studies. The majority, but not all of the studies of ETS and lung cancer have 
shown a positive association for ETS-exposure due to a smoking spouse (Table 1). In 
most of the studies, the confidence interval includes 1.0. While the prospective study by 
Hirayama (1981a) among Japanese women showed a significant association with the 
husband’s smoking (largely adenocarcinomas), the prospective study among American 
Table 1. Summary of results of studies relating lung cancer risk in married women to their 
husbands' smoking habits 

                                Relative risk    95% Confidence interval 

Prospective studies 
  Hirayama (1981)                   1.63             1.25-2.11 
  Garfinkel (1981)                  1.18             0.90-1.54 

Case-control studies 
  Trichopoulos et al. (1981)        2.1              1.18-3.78 
  Chan & Fung (1982)                0.75             0.44-1.30 
  Correa et al. (1983)              2.03             0.83-5.03 
  Koo et al. (1983)                 1.54             0.90-2.64 
  Kabat & Wynder (1984)             0.79             0.26-2.43 
  Wu et al. (1985)                  1.2              0.6-2.5 
  Garfinkel et al. (1985)           1.12             0.74-1.69 
  Lee et al. (1985)                 1.03             0.41-2.47 
  Akiba et al. (1986)               1.48             0.88-2.50 
  Pershagen et al. (1987)           1.28             0.75-2.16 


Table 2. Distribution of lung cancer by histologic groups in smokers and never-smokers. (From 
Kabat and Wynder 1984) 

                                              Smokers                Never-smokers 
                                         Males       Females       Males      Females 
                                       (N = 1882)   (N = 652)     (N = 37)    (N = 97) 
                                          [%]          [%]          [%]         [%] 

Kreyberg I                                63           52           35          21 
Kreyberg II                               32           43           54          74 
Mixed and undifferentiated/anaplastic      5            5           11           5 


women by Garfinkel (1981) did not. It has been suggested that Japanese and American 
women are exposed to different levels of ETS due to different conditions in the two 
countries. Such differences could account for this disparity (Hirayama 1981b). 
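
The statement that most of the confidence intervals in Table 1 include 1.0 can be checked with a simple tally. The short Python sketch below is an illustration only and is not part of the original analysis; the figures are transcribed from Table 1, and the names `TABLE_1` and `interval_includes_unity` are assumptions introduced here.

```python
# Relative risks and 95% confidence limits transcribed from Table 1.
# Each entry: (study, relative risk, lower limit, upper limit).
TABLE_1 = [
    ("Hirayama (1981)", 1.63, 1.25, 2.11),
    ("Garfinkel (1981)", 1.18, 0.90, 1.54),
    ("Trichopoulos et al. (1981)", 2.1, 1.18, 3.78),
    ("Chan & Fung (1982)", 0.75, 0.44, 1.30),
    ("Correa et al. (1983)", 2.03, 0.83, 5.03),
    ("Koo et al. (1983)", 1.54, 0.90, 2.64),
    ("Kabat & Wynder (1984)", 0.79, 0.26, 2.43),
    ("Wu et al. (1985)", 1.2, 0.6, 2.5),
    ("Garfinkel et al. (1985)", 1.12, 0.74, 1.69),
    ("Lee et al. (1985)", 1.03, 0.41, 2.47),
    ("Akiba et al. (1986)", 1.48, 0.88, 2.50),
    ("Pershagen et al. (1987)", 1.28, 0.75, 2.16),
]


def interval_includes_unity(lower, upper):
    """True when a relative risk of 1.0 lies inside the confidence interval."""
    return lower <= 1.0 <= upper


# Studies whose interval excludes 1.0 (nominally significant at the 5% level).
excludes_unity = [name for name, _, lo, hi in TABLE_1
                  if not interval_includes_unity(lo, hi)]

# Studies whose point estimate lies below 1.0.
below_unity = [name for name, rr, _, _ in TABLE_1 if rr < 1.0]

print(f"{len(TABLE_1)} studies listed")
print("Interval excludes 1.0:", ", ".join(excludes_unity))
print("Point estimate below 1.0:", ", ".join(below_unity))
```

The tally only counts intervals; it says nothing about bias or confounding, which the criteria discussed in this section address separately.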

Within those studies presenting specific histologic analysis, differences exist in
respect to the type of lung cancer involved. In active smokers, tobacco smoke exposure
has a causative effect predominantly on squamous and small cell types of lung cancer
(Kreyberg I), with a lesser, though still significant causative effect on the glandular type
(Kreyberg II) (Wynder and Stellman 1977). Among nonsmokers, however, the glandular
type of lung cancer predominates among both men and women (Kabat and Wynder
1984) (Table 2). The effect of ETS would thus be expected to be primarily responsible
for the higher rate of adenocarcinomas among nonsmokers. The studies by Dalager
et al. (1986) and Pershagen et al. (1987), however, suggest that the effect of ETS
exposure is limited to induction of squamous cell lung cancer (Table 3). If this were, in
fact, the case, then only the squamous or small cell type of lung cancer in nonsmokers
Table 3. Histology-specific odds ratios for spouse smoking from two studies

Study                      Histologic type               N     Odds ratio    95% C.I.

Dalager et al. (1986)      Adenocarcinoma                16      1.02*       0.33-3.16
                           Squamous & Small Cell Ca.     14      2.88*       0.91-9.10
                           Other                         18      1.31*       0.48-3.57

Pershagen et al. (1987)    Squamous or Small Cell Ca.    20      3.3         1.1-11.4
                           Other                         47      0.8         0.4-1.5


would be affected by ETS. Clearly, it is important that investigations of the effect of 
ETS exposure on lung cancer development in nonsmokers take histology into account, 
so as to determine whether an effect of ETS is limited to certain histological types. 

Since smoking is more prevalent in lower income groups, at least among men, 
lung cancer in nonsmoking women in these groups should have a higher incidence.
Thus, the influence of the level of education on smoking habits in the examined 
population needs to be considered as a possible confounder. Few studies to date 
have done this. 


Methodological Issues 

A particular concern in weak associations is reporting bias, that is, potentially
differential reporting of exposures between cases and controls. In terms of ETS, does the
lung cancer patient report exposure to tobacco smoke, be it at work, at home, at social
functions, in childhood or adulthood, differently than the control? The case is likely to
have a different attitude toward this question than does the control, a handicap not
applicable to prospective studies. It needs to be determined whether the case's attitude
towards questions on ETS exposure leads to under- or overreporting. Cases are likely to
underreport their own smoking (Lee 1987), and they may tend to overreport their
exposure to ETS and other potential hazards that could account for their illness. In
studies that use proxy reports, different relatives may respond differently. Garfinkel et al.
(1985) provide some insight into this phenomenon by showing that if the response came
from the patient, the odds ratio was 1.0, if from the husband it was 0.92, and if from the
daughter or son, 3.19 (Table 4). More work is needed on the validity of ETS-exposure
information obtained from different relatives before we can evaluate which of these
relative risks is closer to the truth.

In general, possible reporting bias represents a serious problem in case-control studies
because it can produce systematic artefact. It is particularly worrisome in that it cannot
be effectively measured.

We also need to consider misclassification that can occur in both retrospective and 
prospective studies. Lee has proposed (Lee et al. 1986; Lee 1987) that the reported ETS 
effect on lung cancer risk can be explained by a misclassification of smokers as 
nonsmokers. According to these studies, a substantial percentage of respondents 
misrepresent their smoking habits. Using a 10.0% misclassification rate of ex-smokers as 
self-reported never-smokers coupled with the concordance of spouses' smoking habits,
Table 4. Data from Garfinkel et al. (1985) by type of respondent, according to husband's
smoking habits at home

Fig. 1. Odds ratio of male ex-smokers for Kreyberg I (N = 687) and Kreyberg II (N = 301) lung
cancer by years since quitting (controls = 6534). Source: American Health Foundation data


Lee calculated that an apparent increase in lung cancer risk can be obtained among 
nonsmokers married to smokers that approximates the increased risk observed in a 
number of epidemiologic studies (Lee 1987). At the extreme, Garfinkel et al. (1985) 
showed that 40% of lung cancer cases classified as "nonsmokers" in the hospital chart 
were in fact smokers as determined by interview. Although such a high rate of 
misclassification does not occur when cases are interviewed personally, to some extent 
denial is likely to occur even then, particularly among ex-smokers who had stopped 
smoking ten or more years ago. The risk of lung cancer among long-term ex-smokers, 
and even among ex-smokers who quit more than 16 years earlier, does remain elevated 
above the rate among those who never smoked (Fig. 1). Denial of past smoking may also 
not be uncommon in populations where smoking is or was socially unacceptable, as is the 
case among older Japanese women. 
Table 5. Percent of lung cancer cases who never smoked by histologic group (A.H.F. data)

                        Males                               Females
                 KI*            KII**                KI*            KII**
              [%]     N      [%]     N            [%]     N      [%]     N

1969-1973     1.2    488     5.6    142          10.7    103    23.7     76
1974-1976     1.6    887     3.0    305          16.4    263    25.3    146
1977-1980     2.1    628     4.6    390           5.6    231    22.0    245
1981-1985     1.4    725     5.6    463           6.8    311    16.6    284

*  Kreyberg I
** Kreyberg II


Another problem for epidemiologists involves subgroup analysis (Stallones 1987). 
Investigators are likely to examine numerous subgroups, and then prefer to present those 
subgroups that best fit the hypothesis. This tendency represents an inherent problem in 
epidemiology. The investigator should at a minimum give an idea of how many 
subgroups were originally examined and how many subgroups were discarded. 


Temporality 

One of the factors that led to the conclusion that active smoking causes lung cancer was 
that the increase in cigarette consumption preceded the increase in lung cancer rates, first 
in men and later in women. Enstrom (1979) has reported an increase in the lung cancer 
rate in nonsmokers over recent years, suggesting that factors in addition to personal 
cigarette smoking influence lung cancer mortality rates. The groups examined, however, 
are not strictly comparable, and misclassification of smokers as nonsmokers in the 
national surveys needs to be considered. Our data from a long-term, hospital-based case- 
control study do not indicate an increase in the percentage of male nonsmokers with lung 
cancer in either of the two main histologic groupings (Kreyberg I and II) over the last 30 
years (Table 5). 

In fact, the percentage of nonsmokers with lung cancer among women has declined, 
which may be a consequence of the diminishing pool of women who have never smoked. 


Biological Plausibility 

Several studies have demonstrated that most tumorigenic agents are present in undiluted 
sidestream smoke in higher concentrations than in mainstream smoke (Hoffmann et al. 
1983; National Academy of Sciences 1986; Hoffmann and Wynder 1986) (Table 6). 
Biochemical studies indicate that nonsmokers exposed to ETS have levels of nicotine or 
cotinine in the blood or urine that are about 1/100th the level seen in active smokers 
(Table 7) (Jarvis et al. 1984; National Academy of Sciences 1986). Some of the nicotine 
measured in the blood and urine represents nicotine that is absorbed by the saliva of 
nonsmokers and does not reach the lung directly (Jarczyk et al. 1987). It is important to 
Table 6. Distribution of compounds in undiluted cigarette mainstream smoke (MS) and sidestream
smoke (SS). Nonfilter cigarettes

Compound                      MS                  SS/MS

(A) Vapor phase
Carbon monoxide               10 - 23 mg          2.5 - 4.7
Carbon dioxide                20 - 40 mg          8 - 11
Benzene                       20 - 50 µg          10
Formaldehyde                  5 - 100 µg          0.1 - ~50
Acrolein                      50 - 100 µg         8 - 15
Acetone                       100 - 250 µg        2 - 5
Hydrogen cyanide              400 - 500 µg        0.1 - 0.25
Hydrazine                     24 - 43 ng          3.0
Ammonia                       50 - 170 µg         40 - 170
Methylamine                   11.5 - 28.7 µg      4.2 - 6.4
Nitrogen oxides               50 - 600 µg         4 - 10
N-nitrosodimethylamine        10 - 180 ng         20 - 100
N-nitrosopyrrolidine          2 - 110 ng          6 - 30

(B) Particulate phase
Particulate matter            15 - 40 mg          1.3 - 1.9
Nicotine                      1 - 2.5 mg          2.6 - 3.3
Phenol                        60 - 140 µg         1.6 - 3.0
Catechol                      100 - 350 µg        0.6 - 0.9
Hydroquinone                  110 - 300 µg        0.7 - 0.9
Aniline                       360 ng              30
2-Toluidine                   30 - 160 ng         19
2-Naphthylamine               4.3 - 27 ng         30
4-Aminobiphenyl               2.4 - 4.6 ng        31
Benz(a)anthracene             40 - 70 ng          2 - 4
Benzo(a)pyrene                10 - 40 ng          2.5 - 3.5
N'-Nitrosonornicotine         120 - 3,700 ng      0.5 - 3
NNK                           120 - 950 ng        1 - 4
Cadmium                       100 ng              12
Nickel                        20 - 3,000 ng       13 - 30
Polonium-210                  0.03 - 1.0 pCi      7


note that nicotine occurs in ETS primarily as a vapor phase constituent rather than in the
particulate matter of the aerosol as is the case in mainstream cigarette smoke (Eudy et al.
1987). Measurement of nicotine or its metabolites will, therefore, not reflect the
proportional uptake of particulate matter from ETS. In the light of our present
knowledge of dose-response in carcinogenesis and because the carcinogenic activity of
tobacco smoke as measured in animal systems is relatively low, the question needs to be
raised whether the carcinogenic potential of inhaled ETS suffices to induce lung cancer.
Hoffmann and Hecht (1985) have proposed nicotine-derived nitrosamines in ETS as
organ-specific carcinogens for the lung. It is possible that these chemicals reach the lungs
in sufficient dose to induce neoplastic changes. These carcinogens may also be formed
endogenously from inhaled or ingested nicotine and appropriate nitrosating agents
(Hoffmann and Hecht 1985). Tumor promoters are less likely to play a role in ETS
Table 7. Approximate relations of nicotine as a parameter between non-smokers, passive smokers
and active smokers. (From Jarvis et al. 1984)

                      Non-smokers without       Non-smokers with          Active
                      ETS exposure              ETS exposure              smokers
                      No. = 46                  No. = 54                  No. = 94

                      Mean     % of active      Mean      % of active     Mean
                      value    smokers value    value     smokers value   value

Nicotine (ng/ml)
  in plasma           1.0      7                0.8       5.5             14.8
  in saliva           3.8      0.6              5.5       0.8             673
  in urine            3.9      0.2              12.1*     0.7             1,750

Cotinine (ng/ml)
  in plasma           0.8      0.3              2.0*      0.7             275
  in saliva           0.7      0.2              2.5**     0.8             310
  in urine            1.6      0.1              7.7**     0.6             1,390

Differences between non-smokers exposed to ETS compared with non-smokers without
exposure:
*  p < 0.01
** p < 0.001


carcinogenesis than in active smoking because of their much lower concentration. In 
general, tumor promoters are effective only when applied repeatedly in relatively large 
amounts. 

In considering the existing data on ETS exposure and lung cancer, it is noteworthy 
that Auerbach et al. (1961) showed only minor histological changes in the bronchial 
epithelium of nonsmokers and found that the ciliated columnar epithelium that covers 
their bronchi was largely intact. Deposition of carcinogenic smoke particulates can take
place only upon inhibition of the protective functioning of the lung clearance system. 
Squamous cell lung cancer can arise only from ciliated columnar cells that have 
undergone squamous metaplasia. 

An active smoker with each puff from a cigarette inhales a volume of 35-50 ml of a 
concentrated aerosol containing 3-5 billion particles per ml that adversely affect the 
protective cilia and mucous defense system of the bronchi (Ferin et al. 1965). The passive 
smoker is at no time exposed with such force to such a highly polluted inhalant. 
Furthermore, ETS particles are more likely to be deposited in the upper respiratory tract 
and not predominantly in the bronchi as is the case in active smoking. Thus, our 
respiratory defense system may be able to deal more readily with the relatively lighter 
deposition of particles and exposure to volatiles in ETS, as the observation by Auerbach 
et al. (1961) would suggest. 


Future Studies 

Future epidemiological studies on the association of ETS with lung cancer should 
human cancer requires support from descriptive, metabolic, and molecular epidemiology.

Beyond extension of prospective studies, such as those now in progress by Garfinkel and 
Stellman at the American Cancer Society, we suggest: 

1) Continuing ongoing case-control studies with special reference to histologic type and 
careful consideration of methodological issues. 

2) Estimating the relative importance of ETS exposure in different settings - in the 
home, in the workplace, in social situations, and during transportation. 

3) Further studying lung cancer rates among pipe and cigar smokers, and, if feasible, 
among nonsmokers exposed to ETS from these products. 

4) Studying lung cancer incidence in groups occupationally exposed to high levels of 
ETS at their worksite such as waiters, bartenders, train conductors, airplane 
personnel, and office workers. 

5) Studying bronchial epithelium in autopsy material of established never-smokers 
whose exposure to ETS is known. 

6) Determining the incidence of lung cancer by histological type in confirmed never- 
smokers. 

7) Comparing the presence of adducts of tobacco-specific carcinogens with DNA in 
smokers, passive smokers, and "never-smokers" (Hoffmann and Hecht 1985; Hecht et
al. 1987). 

In summary, verification of the possible association of ETS and lung cancer represents an 
important challenge to epidemiologists, laboratory scientists, and public health authorities.
The public is entitled to inhale the cleanest possible air regardless of whether ETS is
proven to be cancer-inducing. Additional efforts on the part of epidemiologists are
required to firmly establish the nature and significance of the reported associations 
between passive smoking and lung cancer. 
What Is the Epidemiologic Evidence for a Passive Smoking-
Lung Cancer Association?

N. Mantel


Summary

Two survey articles of reports on the association of passive smoking with lung cancer
have recently appeared, and also a comprehensive report on the subject of environmental
tobacco smoke by a committee of the National Research Council of the United
States. The observed excess over a relative risk of unity cannot be explained by chance.
Nor can it be fully accounted for by a particular source of bias, the false claims of being
non-smokers by individuals who were active or ex-smokers. That possible source of
bias leads, in one summary survey, to reducing a relative risk of 1.35 to 1.30, but from
1.34 to 1.15 in the National Research Council report. The latter report suggests that
statistical significance would no longer obtain, perhaps, particularly, because of other
possible biases. However, to get an estimate of the correct relative risk due to passive
smoking, allowance has to be made for actual exposure to passive smoking of those not
exposed at home. Thus, the 1.30 is adjusted upwards, by 18% in one survey, to 1.53, but
by only 8% in the National Research Council report to 1.24. The National Research
Council report had given an anticipated relative risk of 1.1 based on dosimetric
considerations. But it is suggested here that that could be as low as 1.05, too low to be
detected in an epidemiologic investigation - in any case it would be based on
hypothetical assumptions.

In November of 1986 there were two near-simultaneous review articles addressing the
subject of passive smoking and lung cancer. One was an invited guest editorial by Blot
and Fraumeni in the Journal of the National Cancer Institute, the other a contemporary
theme discussion by Wald et al. in the British Medical Journal [1, 2].

There was substantial overlapping in the two articles of the various publications on
the subject, and on the basis of which the conclusion of a significant positive association
was made. The article by Wald et al. gave, perhaps, more statistical detail about the
results of the several studies covered. But, to my mind, there was uncritical acceptance of
the results of all the studies. Blot and Fraumeni did suggest that there were some flaws in
a particular study, that by Hirayama [3], but decided that any inherent biases in that
investigation could not have given rise to the observed elevated risk.

From their overall evaluation of 10 case-control studies (all 10 gave results for
females, five separately for males as well) and three prospective studies (two of these
covered males separately), which provided 20 separate relative risk (actually odds ratio)
values, Wald et al. came up with a summary relative risk of lung cancer due to passive
smoking of 1.35 (95% limits 1.19 to 1.54). They trim this down to 1.30 on the basis that
some of the presumed non-smokers exposed to passive smoking were actually smokers.
Then, on the added basis that even those unexposed to passive smoking at home may still
have been exposed when away from home, they raise their estimate of relative risk to 1.53.
But note that this last modification presupposes the answer, that passive smoking does
elevate the risk. For if it did not, there would be no basis for adjusting the 1.30 or 1.35 
upwards to 1.53. 

Blot and Fraumeni come up with a similar summary measure of relative risk for 
passive smoking of 1.3 (95% limits of 1.1-1.5), but elevated to 1.7 (95% limits of 1.4-2.1)
for heavy passive smoking. These authors suggest that heavy passive smoking is 
equivalent, at least in terms of nicotine received, to smoking between 1/2 and 3 cigarettes 
daily, and estimate that smoking a few cigarettes daily would give rise to a relative risk of 
about 1.5-fold to twofold. 

While Blot and Fraumeni do not address the question of correct reporting of non-smoking
status, Wald et al. do, having used this as a basis for lowering the relative risk
estimate from 1.35 to 1.30. Based on reports and communications from others, Wald et
al. estimate that persons reporting themselves as never having smoked (lifelong
non-smokers) comprise 2.1% active smokers plus 4.9% former smokers, for a total of 7% ever
smokers among the self-claimed never smokers. Wald et al. estimate that these 7% have a
combined relative risk of 2, making the assumption in doing this that the active smokers
among the 7% smoked on average only a quarter as much as active smokers generally.
The relative risk of 2 for the 7% is computed as a weighted average of 3 for active
smokers, 1.5 for former smokers, among the 7%.

If 7 % of reported never-smokers were actually ex-smokers or active smokers, which 
were they - the spouses, say, of smokers or the spouses of non-smokers? In my own 
critique of Hirayama, I had suggested that this false reporting of non-smoking status 
would preferentially be among those with smoking spouses [4]. If, for example, the 7%
overall misreporting of non-smoking status concentrated among spouses of smokers, it
would be somewhat higher among persons with smoking spouses who, nevertheless,
claimed to be never smokers. Suppose we take it at 20%, in which case the reported
lifelong non-smokers' relative risk would be 1.20. It could be substantially higher but for
the assumption by Wald et al. that the active smokers among the reported never smokers 
had sharply reduced levels of smoking. However, Wald et al. were ready to make only a 
small reduction in relative risk for this factor, from 1.35 to 1.30. Their speculative 
increase, which might have no basis at all, was much greater, from 1.30 to 1.53. 
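
The arithmetic behind the figures of 2 and 1.20 can be sketched in a few lines of Python. This is an illustrative reading rather than the author's own computation: it assumes that the misreporters are found only among the spouses of smokers, that passive smoking itself has no effect (true relative risk 1.0), and that the function names are hypothetical.

```python
def weighted_relative_risk(groups):
    """Weighted average relative risk for (share, relative_risk) pairs."""
    total_share = sum(share for share, _ in groups)
    return sum(share * rr for share, rr in groups) / total_share


def apparent_relative_risk(misreport_fraction, rr_misreporters, rr_true=1.0):
    """Relative risk seen in a group of self-reported never smokers in which
    a fraction are in fact smokers or ex-smokers (assumed model)."""
    return ((1.0 - misreport_fraction) * rr_true
            + misreport_fraction * rr_misreporters)


# Wald et al.: 2.1% active smokers (relative risk 3) plus 4.9% former
# smokers (relative risk 1.5) among self-claimed never smokers.
rr_misreporters = weighted_relative_risk([(2.1, 3.0), (4.9, 1.5)])

# 20% misreporting among spouses of smokers, using the rounded value of 2.
apparent = apparent_relative_risk(0.20, 2.0)

print(f"Combined relative risk of the misreporters: {rr_misreporters:.2f}")
print(f"Apparent relative risk with 20% misreporting: {apparent:.2f}")
```

Under these assumptions the weighted average comes to 1.95, which rounds to the relative risk of 2 quoted above, and 20% misreporting alone produces an apparent relative risk of 1.20.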

The effect of false reporting of smoking status, specifically of non-smoking, could be 
much sharper than what Wald et al. have suggested. In a study of biochemical markers of 
smoke absorption, Jarvis et al. branded as “deceivers" 21 individuals who claimed to be 
non-smokers [5]. These 21 displayed biochemical patterns very similar to those of actual 
smokers, not at all like those of accepted non-smokers. The 100 accepted non-smokers 
comprised 46 without passive smoking, 54 with. Those 21 would constitute 21/121 or 
about 17 % of the total, and these would be active smokers, not just former smokers, or 
eightfold greater than the 2.1% Wald et al. postulated. Perhaps in the epidemiologic 
investigations made, false reporting of non-smoking status is at a much lower level, but it 
would not take much false reporting to account fully for the seeming association between 
passive smoking and lung cancer. 

Recently, a colleague expressed to me the thought that if passive smoking played no
role in lung cancer, why are we not finding many negative associations, nor any
significantly negative associations? Actually, six of the 20 relative risks reported in Wald
et al. are at 1.00 or smaller. And some of those reported as in excess of 1.00 conceal rates
of under 1.00. Thus, relative to the rate shown of 1.23 for the study reported by Garfinkel
et al., I have brought out in my own critique that that represented a composite of data for
various classes of respondents [6, 7]. Where the woman with lung cancer was herself the
respondent (as to her husband's level of smoking) the relative risk was 0.83. Using the
husbands' responses, the relative risk was 0.77. It was only on the basis of responses by
the sons and daughters, at a time long past when they would have left home, that a
relative risk of 3.$7 emerged, sufficiently high to raise the overall estimate of relative risk
to 1.23. As I indicated in my critique, the replies by the children were more accusatory in
nature than revealing of any true relationship.

But even so, it would take 40 large studies to get on average a single seemingly 
significant negative association of lung cancer with passive smoking, assuming statistical 
testing at the 5%, two-tailed, level. But we have only 20 evaluations, with many so small 
that they could not possibly yield any apparently significant protective effect, not even in 
the unrealistic situation that passive smoking was 100% protective. Suppose a study had 
a null expectation of only 2 or 3 passive smokers with lung cancer - then there would be 
some observed number, 5 or 6 or 7 or 8 or more which would be significantly in excess of 
expectation. But there would be no number, however small or even zero, which would be 
significantly below expectation. Yet just such low expectations characterize several of the
studies reported on by Wald et al. In one study, a relative risk of 2.29 is shown based on
only 2 actual cases of lung cancer in passive smokers, expectation 1.20. Another relative
risk of 2.45 is based on 3 observed, 1.77 expected. For one prospective study, 4 observed
cases have given rise to an estimated relative risk of 3.25, and in another 7 observed cases
gave rise to a relative risk of 2.25, suggestive of an expectation little in excess of 3. On the
other hand, the four reported risks of under 1.00 had expectations variously of 37.67,
34.08, 6.64 and 13.77.
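
The asymmetry described above can be illustrated numerically. The sketch below treats the number of cases as a Poisson count, which is an assumption made here for illustration (the text does not name a distribution), and tests each tail at 2.5%, corresponding to a 5%, two-tailed test; the function names are hypothetical.

```python
import math

ONE_SIDED_ALPHA = 0.025  # each tail of a 5%, two-tailed test


def poisson_lower_tail(k, mean):
    """P(X <= k) for a Poisson count with the given mean."""
    return sum(math.exp(-mean) * mean**i / math.factorial(i)
               for i in range(k + 1))


def poisson_upper_tail(k, mean):
    """P(X >= k) for a Poisson count with the given mean."""
    if k <= 0:
        return 1.0
    return 1.0 - poisson_lower_tail(k - 1, mean)


def smallest_significant_excess(mean, alpha=ONE_SIDED_ALPHA):
    """Smallest observed count whose upper-tail probability is below alpha."""
    k = 0
    while poisson_upper_tail(k, mean) >= alpha:
        k += 1
    return k


# With no true effect, a significantly negative result turns up with
# probability 0.025 per adequately sized study, i.e. once in about 40 studies.
studies_per_false_negative = 1.0 / ONE_SIDED_ALPHA
print(f"Studies per chance negative finding: {studies_per_false_negative:.0f}")

for mean in (2.0, 3.0):
    p_zero = poisson_lower_tail(0, mean)
    print(f"expectation {mean}: P(0 cases) = {p_zero:.3f}, "
          f"significant excess from {smallest_significant_excess(mean)} cases")
```

With an expectation of 2 or 3, even zero observed cases has a probability above 0.025, so no count can be significantly low, whereas counts of 6 or 8 and above are significantly high under this model.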

Of concern to Wald et al. was whether the various relative risks were homogeneous.
On this point they cite a chi-square test for heterogeneity of 20.0 on 19 degrees of
freedom, p > 0.2. However, this is not so much evidence of homogeneity of relative risks
as it is reflective of the high unreliability of the individual relative risks. For 8 of the 20
relative risks shown, the upper limit on the relative risk exceeds the lower limit by a factor 
of about 10 or more, that factor attaining a value of 57 in one instance. 
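
The quoted heterogeneity result can be checked without statistical libraries. The sketch below uses the standard closed-form expression for the chi-square upper tail when the degrees of freedom are odd (a normal tail term plus a finite series); the function name is hypothetical, and the calculation is offered as an illustration rather than as the computation Wald et al. performed.

```python
import math


def chi_square_upper_tail_odd_df(x, df):
    """P(chi-square > x) for an odd number of degrees of freedom.

    Uses the finite-series identity
        Q(x | df) = 2*Q_normal(sqrt(x))
                    + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2}
                          sqrt(x)**(2r-1) / (1*3*...*(2r-1)),
    where phi is the standard normal density.
    """
    if df < 1 or df % 2 != 1:
        raise ValueError("df must be a positive odd integer")
    root = math.sqrt(x)
    normal_tail = math.erfc(root / math.sqrt(2.0))        # 2 * Q_normal(root)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)
    term = root                                           # r = 1 term
    series = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term
        term *= x / (2 * r + 1)                           # step to the next r
    return normal_tail + 2.0 * density * series


p_value = chi_square_upper_tail_odd_df(20.0, 19)
print(f"P(chi-square > 20.0 on 19 d.f.) = {p_value:.3f}")
```

The result is a little under 0.4, which is consistent with the reported p > 0.2.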

Blot and Fraumeni express concern about other long term consequences of passive 
smoking, particularly in connection with coronary artery disease. They cite a report by 
Garland et al. [8] who initially reported a relative risk due to passive smoking of death
from ischemic heart disease of 14.9, but seem unaware that the estimate of 14.9 has been
revised downward to 2.7. In the report of the National Research Council [9], which I will
be discussing below, there is awareness of the downward revision, but not of the fact that
the suggestive significance of p < 0.10 is lost and becomes p < 0.20.

That lung cancer may aggregate in families is also of concern to Blot and Fraumeni,
who cite Ooi et al. on the subject [10]. Elsewhere, and yet to appear, I have suggested that
apparent familial aggregation, in the instance of breast cancer, may be a reflection of an
awareness bias rather than of true familial aggregation [11]. If information about
relatives is not collected more directly, the apparent aggregation based on reports from
the index case may only reflect heightened knowledge by such cases of similar illnesses
among relatives. But the report by Ooi et al. is another instance, like that of Garland et al.,
in which there has been unreliable statistical evaluation. Thus, Ooi et al. initially reported
that the lung cancer risk increased eighteen-fold per 10-year age increase. By letter in the
6 October 1986 issue of the Journal of the National Cancer Institute they have revised that
factor downwards, giving separate factors for each 10-year age interval. From age 50 to
age 60, the factor is now reported at only 2.9.
The Report of the Committee on Passive Smoking, Board on Environmental
Studies and Toxicology, National Research Council [9] 

I have chosen to discuss the epidemiologic aspects of this Report separately, since it is 
essentially the definitive work on current knowledge on environmental tobacco smoke. A 
member of the committee was Nicholas Wald, senior author of one of the articles 
discussed above. The report contains a technical appendix which largely duplicates the 
appendix in the article by Wald et al. and also repeats, with minor variations, the data of 
Wald et al. The body of the report itself contains those same data, but recast differently, 
and it is the same 13 studies, with 20 relative risk values, which underlie the epidemiologic 
aspects of the Committee Report. 

There are a great variety of issues which the Committee Report goes into, whether 
physiochemistry, toxicology, assessment of exposures, use of questionnaires, exposure- 
dose relationships, etc. But my concern at this time is the epidemiology. There could be a 
point to estimating the annual number of lung cancer deaths in the United States due to 
passive smoking, but that would have to be on the presumption that passive smoking 
does play a causative role. 

However, the Committee Report is quite restrained in its findings and leaves open the 
question of whether anything has been established. If the apparent relative risk is 
significantly greater than unity, the excess cannot be fully explained away by certain 
biases considered. However, whether there is statistical significance in view of those 
biases is not addressed. 

From dosimetric considerations, the Report suggests that the excess risk of lung 
cancer due to environmental tobacco smoke should be 1 % of the excess risk due to active 
smoking. This leads to a relative risk of 1.14 for men, perhaps less for women. From the 
epidemiologic data, the summary relative risk is 1.34, but it is brought out that for United 
States studies only the relative risk would be only 1.14. If only large studies are 
considered, the overall relative risk would be 1.32. 

Next addressed by the Report is the effect of biases, particularly the bias associated 
with the false reporting of individuals that they were not (or never have been) smokers.
This leads to a lowering of the estimated relative risk of 1.34 (or 1.30 to 1.34) to 1.15. But 
note that on this same basis, Wald et al. were willing to reduce an apparent relative risk of 
1.35 only slightly, to 1.30. 

Yet another adjustment is made. If non-smokers are not exposed to environmental 
tobacco smoke at home, they might still be exposed to it away from home. An upward 
adjustment of 8% on account of this yields 1.15 X 1.08 = 1.24. This contrasts with the 
upward adjustment of 18% made by Wald et al., who calculated 1.30 X 1.18 = 1.53. The 
Committee Report differs markedly from the separate report made by one of its own 
members. 
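
The two upward adjustments are simple multiplications and can be compared side by side. The snippet below is only a restatement of the figures quoted above; the function name `adjust_upwards` is an assumption introduced for the illustration.

```python
def adjust_upwards(relative_risk, percent):
    """Scale a relative risk upwards by the stated percentage."""
    return relative_risk * (1.0 + percent / 100.0)


# Wald et al.: 1.30 after the misclassification correction, raised by 18%.
wald_adjusted = adjust_upwards(1.30, 18)

# Committee Report: 1.15 after the misclassification correction, raised by 8%.
committee_adjusted = adjust_upwards(1.15, 8)

print(f"Wald et al.:      1.30 x 1.18 = {wald_adjusted:.2f}")
print(f"Committee Report: 1.15 x 1.08 = {committee_adjusted:.2f}")
```

Both adjustments start from a value already corrected for misreported smoking status, so the gap between 1.53 and 1.24 reflects the differences in both the correction and the size of the upward factor.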

In discussing Wald et al. I suggested that the upward modification they made
presupposed a positive role for passive smoking. This same thing is true for the 8%
upward adjustment in the Committee Report. For purposes of evaluating the statistical 
significance of the findings, the relative risk should be taken as 1.15, though the value of 
1.24 might be appropriate for assessing the toll in excess lung cancer due to passive 
smoking assuming that there is causality. With the United States studies indicating an 
unadjusted relative risk of only 1.14 rather than 1.34, both the 1.15 and the 1.24 might be 
sharply lowered if intended to apply only to the United States. 

But let me stay with the relative risk of 1.15 prior to the 8 % upward adjustment. Is that 
relative risk significantly in excess of 1.00? I suspect not. And even the question of bias 
remains open. Both in the Committee Report and in the article by Wald et al., the only 
biases factored in were just those that would fit into neat mathematical formulas. More
subtle biases or ones that had not been thought of did not get in. I gave an example above 
of the use by Garfinkel et al. of the responses by sons and daughters of the level of 
smoking by the fathers. 

I might even speculate about publishing bias. If an investigator got a weakly or 
insignificantly negative result for the role of passive smoking in lung cancer, would he 
bother submitting it for publication? And if he did, would it be accepted for publication?
Postulating this kind of bias is not necessary for establishing that the 1.15 relative risk is 
likely not significant. But I bring it up in connection with a tendency I see towards 
accepting uncritically or less critically manuscripts which are on the right side of the fence 
on the issue of passive smoking. A particular example was the publication of the article by 
Garland et al. on passive smoking and ischemic heart disease mortality, the claims of 
which fell apart on scrutiny. 

Let me bring up now another thought. Some time ago the possibility of subtle or not- 
so-subtle biases in case-control or other epidemiologic investigations was so much a 
matter of concern that it was suggested that unless the relative risk were at least 2.0, any 
increase in risk should not be accepted. Perhaps we can do better now and might employ a 
less restrictive criterion. 

But I can see no relaxation to the point of accepting the relative risks now observed for 
passive smoking in lung cancer. What we must accept is that it is unlikely that any 
epidemiologic investigation has been or can be mounted which would establish a causal
role for passive smoking in lung cancer. Those who believe such a role exists should
continue to believe as much, and might even hazard estimates as to the resulting toll in
deaths and disease, with others allowed to hold contrary beliefs. What would be incorrect
would be to claim that epidemiologic studies have established the correctness of the
belief.

If epidemiologic investigations cannot establish a role for passive smoking, the best
we can do is to make suppositious estimates of how great that role may be - and such
suppositious estimates can be too high if any of the underlying supposals are false. One
supposal would be that the dosage response curve is linear through the origin, another
that some particular biochemical measure, say level of cotinine, is a proper measure of
the equivalent exposure to cigarettes of passive smoking. And, I point out, there could be
the assumption that the temperature at which tobacco smoke is inhaled is not relevant,
though I would think that fresh hot smoke would be more active than stale smoke.

With this thought in mind, we can pick up some clues from the report of Jarvis et al.,
who, after excluding “deceivers”, report average cotinine levels in plasma, saliva, and
urine of 100 non-smokers to be at 0.55%, 0.55% and 0.36% respectively of those levels
from 94 smokers. Let us take it at 0.5%. If the average cigarette smoker has a relative risk
for lung cancer of 10.0 (enhancement of 900%, though the enhancement may be 1,400%
for very active smokers), this would put the enhanced risk due to environmental tobacco
smoke at 4.5%, for a relative risk of 1.045 (it would be 1.07 using the 1,400%
enhancement for very active smokers). That relative risk, 1.045, would encompass both
passive smoking at home and away from home, including individuals not exposed to
passive smoking at home.
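The arithmetic in the preceding paragraph can be sketched as follows. The calculation rests on the supposals named in the text (a dose-response curve that is linear through the origin, and cotinine as a proper measure of equivalent exposure); the function name is an illustration and does not come from any of the cited reports.

```python
def dosimetric_relative_risk(exposure_fraction, smoker_enhancement):
    """Relative risk implied by a linear, through-the-origin dose response.

    exposure_fraction: non-smoker cotinine as a fraction of smoker cotinine
                       (0.005 means 0.5%).
    smoker_enhancement: excess risk of the active smoker as a fraction
                        (9.0 means 900%, i.e. a relative risk of 10.0).
    """
    return 1.0 + exposure_fraction * smoker_enhancement


average_smoker = dosimetric_relative_risk(0.005, 9.0)   # 900% enhancement
heavy_smoker = dosimetric_relative_risk(0.005, 14.0)    # 1,400% enhancement

print(f"{average_smoker:.3f}")  # 1.045
print(f"{heavy_smoker:.3f}")    # 1.070
```

If either supposal fails, for example if the dose-response curve is not linear at low doses, the figures change accordingly.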

What matters, however, relative to the conduct of epidemiologic studies on the 
subject, is the differential in relative risk between those knowingly exposed to passive 
smoking and those who believe themselves unexposed. From data available in Jarvis et 
al., it would appear that those seemingly not exposed to passive smoke (46 in number) 
nevertheless have a relative risk of about 1.02. For the 54 non-smokers claimed to be 
actually exposed to passive smoking, the relative risk based on cotinine levels would, in
similar manner, be 1.07. Compared then to those seemingly not exposed to passive smoking,
the calculated relative risk for the known exposed to passive smoking would be 1.05. That
small increase in relative risk just would not show up in any epidemiologic investigation
and would be submerged, in any case, by other very likely biases. The National Research
Council report had suggested a relative risk, based on dosimetric considerations, of 1.14,
but on the assumption that the enhancement in risk due to active smoking was 1,400%.
An enhancement of 900% would have led them to an anticipated relative risk of 1.09. But
whether we use 1.05, 1.09, or 1.14, the effect would still be undetectable.
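The two comparisons in this paragraph can be checked with a short sketch. It assumes, as the text does, that risk scales linearly with dose; the function names are illustrative only.

```python
def differential_relative_risk(rr_exposed, rr_unexposed):
    """Relative risk of the knowingly exposed compared with the seemingly unexposed."""
    return rr_exposed / rr_unexposed


def rescale_relative_risk(rr, old_enhancement, new_enhancement):
    """Re-express a dosimetric relative risk under a different smoker enhancement.

    The implied exposure fraction is (rr - 1) / old_enhancement; it is then
    multiplied by the new enhancement (9.0 means 900%).
    """
    implied_exposure = (rr - 1.0) / old_enhancement
    return 1.0 + implied_exposure * new_enhancement


# 54 exposed non-smokers (1.07) against 46 seemingly unexposed (1.02).
print(round(differential_relative_risk(1.07, 1.02), 2))   # 1.05

# National Research Council figure of 1.14 at 1,400%, restated at 900%.
print(round(rescale_relative_risk(1.14, 14.0, 9.0), 2))   # 1.09
```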

As a last point, I raise the issue of passive smoking effects on children. If parents can 
be shamed into not exposing their children to passive smoking, this is all well and good, 
even if the supporting basis is unsound. I note that the ill effects arise mostly in early 
childhood, and have two questions. Have the passive smoking effects been isolated from 
effects due to mother’s smoking prior to the child’s birth? To what extent has account
been taken of the fact that cigarette smoking concentrates in families with lower
socio-economic status, as evidenced by lower educational level, more unemployment, etc.? Rona et al.
also brought in the factor of overcrowding at home in their report that passive smoking 
resulted in some small reduction in the stature of children [12]. But even Rona et al. failed 
to take properly into account, as I have suggested, the role of some of these important 
factors on smoking rates in their evaluation [13]. 

What with subtle biases, not so subtle biases, and even extravagant errors, one should 
not accept too readily claimed demonstrations of ill effects of passive smoking. Passive 
smoking has been the favorite whipping boy of epidemiologists for too long already. The
public is entitled not to be unnecessarily exposed to environmental tobacco smoke, but
any panic is unjustified.

GLOSSARY


Acute: Having a short course; of short duration. 

Animal study: A controlled laboratory experiment in which animals 
are exposed to an agent and the biological effects of this 
exposure are assessed. The exposure may be via food or water 
(ingestion), by injection, by external application or by 
inhalation. Typical effects that might be measured are tumor 
incidence or tissue and organ changes. 

Bias: Regarding epidemiologic studies, the operation of factors
in a study's design or execution that erroneously lead to the
appearance of a stronger or weaker association between the
agent in question and disease than in fact exists.

Bioassay: The determination of the activity of a sample of an
agent by noting its effect on a live animal or an isolated
organ preparation.

Carcinogen: A substance or agent designated as capable of producing 
or initiating cancer. 

Carcinogen classification system: A system for stratifying the
weight of evidence for human carcinogenicity, for example,
the system followed by the EPA. The EPA system consists of
the following levels: Group A — carcinogenic to humans;
Group B — probably carcinogenic to humans; Group C — possibly
carcinogenic to humans; Group D — not classifiable as to human
carcinogenicity; and Group E — evidence of non-carcinogenicity
for humans.1


Case-control study: A type of epidemiologic study which compares 
diseased persons (cases) with nondiseased persons (controls) 
in association with a common exposure to an agent. 


Chronic: Persisting over a long period of time. Regarding animal
studies, refers to administration of the test substance over
a period of several weeks or months.


Cohort study: An epidemiologic study which examines the development 
of a disease in a group (cohort) of persons who are currently 
free of the disease. May assess exposure either prospectively 
or retrospectively. 


Confounding: As applied to epidemiologic studies, the situation
in which the relationship between an agent and a disease
appears stronger or weaker than it truly is due to the
influence of another unknown or unrecognized factor. In
confounding, the agent under consideration is associated with
another agent (a confounding factor, or confounder) which is
itself associated with either an increase or decrease in the
incidence of the disease.

1 The definitions for carcinogen classification system, dose-response
assessment, exposure assessment, hazard identification, risk
assessment, risk characterization and weight of evidence are
taken from the EPA’s 1986 "Guidelines for Carcinogen Risk
Assessment," 51 Fed. Reg. 185, 33992-34003.

Dose-response assessment: Part of a risk assessment. Defines the 
relationship between the dose of an agent and the probability 
of induction of a carcinogenic effect. 

Environmental tobacco smoke (ETS): Consists of smoke originating 
from the smoldering end of a tobacco product between puffs, 
e.g., sidestream smoke, and of smoke exhaled by the smoker. 
The components are released into the environment where they 
are diluted by ambient air and undergo changes related to 
aging over time. 

Epidemiology: The branch of science concerned with the patterns
of disease in human populations and the various factors that
influence these patterns.

Exposure assessment: Part of a risk assessment. Identifies
populations exposed to the agent, describes their composition
and size, and presents the types, magnitudes, frequencies and
durations of exposure to the agent.

Hazard identification: Part of a risk assessment. A qualitative
assessment of risk, dealing with the process of determining 
whether exposure to an agent has the potential to increase 
the incidence of cancer. It qualitatively answers the question 
of how likely an agent is to be a human carcinogen. 

In vitro: Literally, within glass; used to refer to laboratory
procedures conducted in a test tube or similar location, often
involving preparations of cells or tissues.

In vivo: Literally, within the living body; used to refer to
laboratory procedures utilizing live animals.

Mainstream smoke (MS): Tobacco smoke drawn through the butt end 
of a cigarette. 

Meta-analysis: A statistical technique for combining studies into 
a single analysis, designed to increase the ability to 
statistically detect an association if such an association is 
present. 

Mutagen: An agent that tends to increase the frequency or extent
of mutation, i.e., physical or biochemical changes in the
genetic material of an organism.

Binder, R., et al., "Importance of the Indoor Environment in
Air Pollution Exposure," Arch Environ Health 31(6): 277-279,
1976.

Pharmacokinetics: The study of the action of chemical substances
in the body over a period of time, including the processes of
absorption, distribution, metabolism and excretion.

Relative risk: The ratio of the incidence rate of a disease among 
individuals exposed to a particular risk factor to the 
incidence rate among unexposed individuals. 

Risk assessment: The determination of adverse health consequences 
from exposure to toxic agents. [Will be carried out 
independently from considerations of the consequences of 
regulatory action.] Includes one or more of the following 
components: hazard identification, dose-response assessment,
exposure assessment and risk characterization.

Risk characterization: Part of a risk assessment. Combines the
results of exposure assessment and dose-response assessment
to estimate a carcinogenic risk in quantitative terms.

Risk management: A combination of risk assessment with the
directives of regulatory legislation, together with
socioeconomic, technical, political and other considerations,
to reach a decision as to whether or how much to control future
exposure to suspected toxic agents.

Short-term tests: In vitro (performed on cells or tissue cultures) 
tests for mutations, including tests for chromosome 
aberrations, DNA damage/repair and other transformations which 
provide supportive evidence of cellular changes and may give 
information on carcinogenic mechanisms. 

Sidestream smoke (SS): Smoke originating from the smoldering end 
of a tobacco product between puffs. 

Statistical significance: A procedure to quantify the probability
that an observed outcome, e.g., an association between an
exposure and a disease endpoint, would arise from random
variation alone. The scientific community often uses 5% as a
standard level at which data are accepted as occurring other
than by chance. This means that, if random variation alone were
at work, an outcome at least as extreme as the one observed
would be expected less than 5% of the time.

Toxicology: The scientific study of poisons, their actions, their
detection and the treatment of the conditions produced by them.

Weight of evidence: A framework utilized by the EPA for judging 
the likelihood that an agent is a human carcinogen. Three 
major steps are involved: (1) characterization of evidence
from human studies and from animal studies, individually; (2)
combination of the characterizations of these two types of
data into an indication of the overall weight of evidence;
and (3) evaluation of all supporting information to determine
if the overall weight of evidence should be modified. [See
also definition for carcinogen classification system.]

DEFINITIONS


In medical research, there are two major types of studies: 
experimental studies and observational studies. 

An Experimental Study requires that the members of a 
study population be assigned to either a treatment or control 
group. The treated and untreated groups are then followed 
prospectively to see whether the two groups subsequently differ in 
their disease experience. 

An Observational Study is one in which the
treatment or exposure of interest is not assigned but instead occurs 
by choice or by happenstance. 

The types of observational studies are the case report, 
the cross-sectional study, the ecologic study, the case-control 
study, and the cohort study (often called prospective). 

A Case Report is strictly speaking not a scientific 
study but a description of a small number of persons with an unusual 
disease or an unusual change in their disease status. 

A Cross-sectional Study reports the characteristics
of a group of people at one point in time or a snapshot of their
health picture.

The Ecologic Study uses data that are routinely
collected (such as air pollution data) to study the occurrence of
disease among groups of people. For example, heart disease
incidence may be studied in a group of people where information on
national dietary habits is known.

A Case-control Study or Retrospective Study is one 
that begins with study subjects who have the disease of interest 
and a comparison group without the disease. The previous exposures 
of both groups are investigated. 

A Cohort Study or Prospective Study is one in which 
the researcher starts with one group of persons exposed to a factor 
of interest and another comparable group that is unexposed. These 
groups are observed at a later time to see whether they have 
developed differences which might be attributable to their 
different exposures. 

A Confounder is a factor which confuses the correct
interpretation of the data relating a suspect factor to disease. The
confounding factor acts by being associated both with the exposure
and the disease in a way that makes the exposure and the disease
seem to be related. An example, which was published in 1978,
related jet plane noise to an increased death rate. Upon
reexamination, it was found that persons exposed to jet noise lived
in devalued housing close to airports and were of a less fortunate
socioeconomic stratum. When a proper analysis of these other factors
was performed, jet noise was found to have no association with
increased mortality.

Confounding is the process by which a noncausal
association between two factors is produced by their association
with a third factor known as the confounder.

Bias is nonrandom error. It is not related to the word bias
used in the sense of prejudice.

Risk (or absolute risk) is expressed as a death rate or 
disease rate. 

Relative Risk is the ratio or quotient of two risks or 
absolute risks. It is also known as a risk ratio. 

Odds Ratio is a measure of risk usually obtained from 
case-control studies and mathematically close to relative risk. 

p-value is the probability, calculated on the assumption that chance
alone is operating, of obtaining a finding at least as extreme as the
one observed. By convention, a finding with a p-value less than 5%,
or sometimes 1%, is called statistically
significant.

Statistical Association. Two factors are statistically
associated when there is a tendency for the factors to occur 
together or to change together. The observed relationship is the 
statistical association, measured in many ways. 
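The definitions of Relative Risk and Odds Ratio given above can be illustrated with a two-by-two table. The sketch below is a minimal example under stated assumptions: all counts are hypothetical, chosen only to show why the odds ratio is close to the relative risk when the disease is rare and drifts away from it when the disease is common.

```python
def relative_risk(a, b, c, d):
    """Ratio of two risks.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio of the same two-by-two table."""
    return (a * d) / (b * c)


# Hypothetical rare disease: 30 and 20 cases per 10,000 persons.
rare = (30, 9_970, 20, 9_980)
# Hypothetical common disease: 300 and 200 cases per 1,000 persons.
common = (300, 700, 200, 800)

for label, table in (("rare", rare), ("common", common)):
    print(label, round(relative_risk(*table), 3), round(odds_ratio(*table), 3))
```

In both hypothetical tables the relative risk is 1.5; the odds ratio is about 1.50 for the rare disease and about 1.71 for the common one.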



Source: https://www.industrydocuments.ucsf.edu/docs/lnbjOOOO 





A DICTIONARY OF EPIDEMIOLOGY

SECOND EDITION


Edited for the
International Epidemiological Association
by
John M. Last


New York Oxford Toronto
OXFORD UNIVERSITY PRESS
1988






Oxford University Press

Oxford New York Toronto
Delhi Bombay Calcutta Madras Karachi
Petaling Jaya Singapore Hong Kong Tokyo
Nairobi Dar es Salaam Cape Town
Melbourne Auckland

and associated companies in
Berlin Ibadan

Copyright © 1988 by International Epidemiological Association, Inc.

Published by Oxford University Press, Inc.,
200 Madison Avenue, New York, New York 10016

Oxford is a registered trademark of Oxford University Press

All rights reserved. No part of this publication may be reproduced,
stored in a retrieval system, or transmitted, in any form or by any means,
electronic, mechanical, photocopying, recording, or otherwise,
without the prior permission of Oxford University Press.

Library of Congress Cataloging-in-Publication Data

A Dictionary of epidemiology.

Includes bibliographies.

1. Epidemiology—Dictionaries. I. Last, John M., 1926-
II. International Epidemiological Association.

[DNLM: 1. Epidemiology—dictionaries.
WA 19 D553]

RA651.D53 1988 614.40321 1741409

ISBN 0-19-505480-6
ISBN 0-19-505481-4 (pbk.)



Foreword 


The International Epidemiological Association is extremely
pleased that the Dictionary of Epidemiology has been so successful
that a second edition has been demanded. As one of the
Association's aims is to “spread the message,” this work is an
example of “what we call it.” Only if we all understand the same
thing when a particular term is used will the aim of the
Association be capable of being fulfilled. This dictionary is
fundamental to this objective.


W. W. Holland, MD FRCGP FRCP FFCM
President, International Epidemiological Association

Printed in the United States of America
on acid-free paper


Preface


This dictionary, appearing now in its second edition, is an
attempt to bring some order to the occasionally chaotic
nomenclature of epidemiology. It is intended for all who are
interested in epidemiology, especially those who are beginning to
study the subject, those whose first language is not English, and
those from other fields who need to know the terms
epidemiologists use.

Like all rapidly expanding sciences, epidemiology has been
confounded by the proliferation of words and phrases to
describe its concepts, principles, methods, and procedures. The
creation of new terms and disagreement about the meaning of
old ones can confuse beginners and established epidemiologists
alike.

Remarks by users of the first edition have reinforced the view
that the boundaries should be wide rather than narrow, that
the language should be simple, that some terms many
epidemiologists think everyone already knows should be included.
The second edition is larger than the first, partly for this
reason, and because terms omitted from the first edition have been
included and many old entries expanded.

The dictionary is not an index of permitted and proscribed
usage. I hope that it is authoritative without being
authoritarian. Where synonyms exist, the definition appears under the
most commonly used of these, but preference for one term over
another is not necessarily implied. In a few instances, the use
of a term is deprecated. Some terms that are properly
described as slang or jargon have been included because they are
widely used and their meaning is not always clear from the
context. Murphy's description of jargon is worth recalling:
"obscure and/or pretentious language, circumlocutions, invented
meanings, and pomposity delighted in for its own sake."

There was disagreement among the contributors to this
edition about including certain acronyms and eponyms. An
acronym is a word made up of letters from two or more other words,
e.g. ANOVA for analysis of variance, or from initial letters, e.g.
WHO for World Health Organization. All lay and technical
vocabularies contain acronyms; epidemiology has its fair share.
By convention, acronyms are spelt out the first time they
appear in a text, and, if they are numerous, considerate editors
sometimes supply a glossary, or at least list the acronyms along
with the words for which they stand in an index. Although this
dictionary is not the place for extensive mention of acronyms,
a few appeared in the first edition, and a few more appear
here.

Eponyms, the attachment of personal or place names to
concepts, diseases, methods or specific studies, also occur often
enough in published papers and books for us to recognize that
beginners need some guidance to the meaning of those most
widely used. Some appeared in the first edition, and a few have
been added to the second—though again this dictionary is not
the proper place for a full glossary of epidemiological eponyms
(where would such a glossary end!).

As was the case with the first edition, a large number of
epidemiologists from many countries have participated in this
revision. The original modest notices in a couple of journals and
a few casual remarks among friends produced a mailing list of
some forty persons, mainly in North America and the United
Kingdom. The mailing list rapidly grew until, by the fifth round
of correspondence in December 1986, there were 108
correspondents in 25 countries. The list continued to grow after this
fifth and final round; but the published roster of names that
follows this preface is both more and less than the number of
active participants. Some seemingly inquired just from curiosity
and played no further part. Others wrote lengthy and often
vigorously argumentative comments and suggestions
expressing not only their own views but those of colleagues in their
academic department or institution—in one instance,
colleagues elsewhere in that nation.

In addition to extensive comments from these
correspondents, I have made good use of other technical dictionaries and
glossaries in compiling this revision. All of these are listed in
the bibliography, and many are also to be found in footnotes
that follow specific entries.

The compilers of dictionaries must exercise the greatest care 
in the choice of words and in their arrangement. Most entries 
in this dictionary have been repeatedly discussed with many 
contributors, and in nearly all instances the wording has been 
agreed upon by all; on the rare occasions when agreement eluded 
us, the final decision was mine alone. Therefore, I accept full 
responsibility for the deficiencies in the finished product. 

The work has been sponsored by the International
Epidemiological Association, which provided partial travel support
for me to attend two meetings in 1986; further support was
provided by the McLean Foundation and the Milbank
Memorial Fund. All royalties from the sale of this edition, like those
from the first edition, will go to the International
Epidemiological Association.

Finally, I thank Jeffrey House of Oxford University Press for 
helpful advice and encouragement. 


Ottawa, Canada                                        J. M. L.
November 1987


abortion rate The estimated annual number of abortions per 1000 women of
reproductive age (usually defined as age 15-44).

abortion ratio The estimated number of abortions per 100 live births in a given year.

abscissa The distance along the horizontal coordinate, or x axis, of a point P from the
vertical or y axis of a graph. See also axis, graph, ordinate.

absolute risk Usually this term means the observed or calculated risk of an event in
a population under study, as contrasted with the relative risk. Sometimes, however,
it is a synonym for attributable fraction, excess risk, or risk difference; because of
the inconsistency, this term should be avoided. See also risk.

acceptable risk The risk that has minimal detrimental effects, or for which the
benefits outweigh the potential hazards. Epidemiologic study has provided data for
calculation of risks associated with many medical procedures and also with
occupational and environmental exposures; these data are used, for instance, in clinical
DECISION ANALYSIS.

accuracy The degree to which a measurement, or an estimate based on
measurements, represents the true value of the attribute that is being measured. See also
measurement, problems with terminology.

acquaintance network Group of persons in contact or communication among whom
transmission of an infectious agent and of knowledge, attitudes, and values is
possible, and whose social interaction may have health implications. See also
transmission of infection.

acquired immunodeficiency syndrome (Syn: acquired immune deficiency syndrome)
(AIDS) For surveillance purposes, the Centers for Disease Control, Atlanta,
Georgia,1 define a case of AIDS as an illness characterized by (1) one or more of a group
of opportunistic or indicator diseases that are indicative of underlying cellular
immunodeficiency; (2) absence of all known underlying causes of cellular
immunodeficiency and absence of all other causes of reduced resistance to opportunistic or
indicator diseases. Additional criteria are serum positive for HIV antibody, positive
culture for HIV, and reduction of T4 “helper” lymphocytes.

The opportunistic or indicator diseases associated with AIDS include certain
protozoal and helminth infections, notably Pneumocystis carinii pneumonia and
toxoplasmosis; fungal infections, notably candidiasis of esophagus, trachea, bronchi or
lungs and cryptococcosis, especially affecting the central nervous system; bacterial
infections, notably with certain mycobacteria; viral infections, notably
cytomegalovirus and herpes simplex; and cancer, notably Kaposi's sarcoma and lymphoma
limited to the brain.

AIDS-related complex (ARC) is the combination of HIV positive test with
lymphadenopathy and persistent low fever but without immunodeficiency or
opportunistic diseases.

1 1987 Revision of case definition of AIDS for surveillance purposes. MMWR 36, 1S:4S-9S, 1987.

ACTIVITIES OF DAILY LIVING (ADL) SCALE A scale devised by Katz and others1 to score
physical ability/disability; used to measure outcomes of interventions for various
chronic disabling conditions such as arthritis. The scale is based on scores for
responses to questions about mobility, self-care, grooming, etc. This was the first widely
used scale of this type; others, mostly refinements or variations of the ADL scale,
have since been developed.

1 Katz S, Ford AB, Moskowitz RW, Jackson BA, Jaffe MW: Studies of illness in the aged. The
index of ADL: a standardized measure of biological function. JAMA 185:914-919, 1963.

ACTUARIAL RATE See FORCE OF MORTALITY.

ACTUARIAL TABLE See LIFE TABLE.

ACUTE

1. Referring to a health effect, brief; sometimes loosely used to mean severe.

2. Referring to exposure, brief, intense, or short-term; sometimes specifically
referring to brief exposure of high intensity. See also chronic.

adaptation A heritable component of the phenotype which confers an advantage in
survival and reproductive success. The process by which organisms adapt to
environmental conditions.

additive model A model in which the combined effect of several factors is the sum of
the effects that would be produced by each of the factors in the absence of the
others. For example, if factor X adds x% to risk in the absence of Y, and if factor Y
adds y% to risk in the absence of X, an additive model states that the two factors
together will add (x+y)% to risk. See also interaction; linear model;
mathematical model; multiplicative model.

adjustment A summarizing procedure for a statistical measure in which the effects of
differences in composition of the populations being compared have been
minimized by statistical methods. Examples are adjustment by regression analysis and
by standardization. Adjustment often is performed on rates or relative risks,
commonly because of differing age distributions in populations that are being
compared. The mathematical procedure commonly used to adjust rates for age
differences is direct or indirect standardization.

adverse reaction, side effect Any undesirable or unwanted consequence of a
preventive, diagnostic, or therapeutic procedure.

AETIOLOGY, AETIOLOGIC See ETIOLOGY, ETIOLOGIC.

AGE DEPENDENCY RATIO See DEPENDENCY RATIO.

agent (of disease) A factor, such as a microorganism, chemical substance, or form of
radiation, whose presence, excessive presence, or (in deficiency diseases) relative
absence is essential for the occurrence of a disease. A disease may have a single
agent, a number of independent alternative agents (at least one of which must be
present), or a complex of two or more factors whose combined presence is essential
for the development of the disease. See also causality; necessary and sufficient
cause.

age-period cohort analysis See cohort analysis.

age-sex pyramid See population pyramid.

age-sex register List of all clients or patients of a medical practice or service,
classified by age (birthdate) and sex; provides denominator for calculating age- and
sex-specific rates.

age-specific fertility rate The number of births occurring during a specified
period to women of a specified age group, divided by the number of person-years
lived during that period by women of that age group. When an age-specific fertility
rate is calculated for a calendar year, the number of births to women of the
specified age is usually divided by the midyear population of women of that age.

age-specific rate A rate for a specified age group. The numerator and denominator
refer to the same age group.

Example:

                           Number of deaths among residents
    Age-specific death     age 25-34 in an area in a year
    rate (age 25-34)    =  ----------------------------------  x 100,000
                           Average (or midyear) population
                           age 25-34 in the area in that year

The multiplier (usually 100,000 or 1,000,000) is chosen to produce a rate that can
be expressed as a convenient number.

age standardization A procedure for adjusting rates, e.g. death rates, designed to
minimize the effects of differences in age composition when comparing rates for
different populations. See also adjustment, standardization.
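
A minimal sketch of direct standardization, one of the two procedures named under adjustment, may help here. It is an illustration under stated assumptions, not the dictionary's own worked example: the age groups, rates, and population sizes are hypothetical, and the function name is invented for this sketch. Two populations with identical age-specific rates but different age structures show different crude rates and the same standardized rate.

```python
def direct_standardized_rate(age_specific_rates, standard_population):
    """Weighted average of age-specific rates, weighted by a standard population.

    age_specific_rates: rate per 100,000 for each age group.
    standard_population: number of persons in each age group of the standard.
    """
    total = sum(standard_population.values())
    weighted = sum(age_specific_rates[group] * standard_population[group]
                   for group in standard_population)
    return weighted / total


# Hypothetical age-specific death rates per 100,000, the same in both areas.
rates = {"under 65": 100, "65 and over": 1000}

# Hypothetical age structures of two areas and of a chosen standard population.
area_a = {"under 65": 8000, "65 and over": 2000}
area_b = {"under 65": 5000, "65 and over": 5000}
standard = {"under 65": 7000, "65 and over": 3000}

# The crude rate is the same weighted average taken over each area's own structure.
print(direct_standardized_rate(rates, area_a))    # 280.0
print(direct_standardized_rate(rates, area_b))    # 550.0
# Applying one standard population removes the difference due to age structure.
print(direct_standardized_rate(rates, standard))  # 370.0
```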
aggregation bias (Syn: ecological bias) See ecological fallacy, 
aging of the population A demographic term, meaning an increase over time in the 
proportion of older persons in the population. It does not necessarily imply an 
increase in life expectancy or that "people are living longer than they used to." The 
principal determinant of aging in the population has been a decline in the birth 
rate; when fewer children are bom than in prior years, the result, in the absence 
of a rise in the death rate at higher ages, has been an increase in the proportion of 
older persons in the population. In developed societies, however, mortality change 
is becoming a factor: little further mortality reduction can occur in the first half of 
life, so reductions are beginning to occur in the third and fourth quarters of life,
leading to a rise in the proportion of older persons from this cause. 
airborne infection A mechanism of transmission of an infectious agent by particles,
dust, or droplet nuclei suspended in the air. See also transmission of infection.
algorithm Any systematic process that consists of an ordered sequence of steps with 
each step depending on the outcome of the previous one. The term is commonly 
used to describe a structured process, for instance, relating to computer programming
or to health planning. See also decision tree.
algorithm, clinical (Syn: clinical protocol) An explicit description of steps to be taken 
in patient care in specified circumstances. This approach makes use of branching 
logic and of all pertinent data, both about the patient and from epidemiologic and 
other sources, to arrive at decisions that yield maximum benefit and minimum risk. 
allele Alternative forms of a gene, occupying the same locus on a chromosome. 
ALPHA ERROR See ERROR, TYPE I. 

ALPHA LEVEL See SIGNIFICANCE LEVEL.

analysis of variance A statistical technique that isolates and assesses the contribution 
of categorical independent variables to variation in the mean of a continuous de¬ 
pendent variable. The observations are classified according to their categories for 
each of the independent variables, and the differences between the categories in 
their mean values on the dependent variable are estimated and tested for statistical 
significance. 

analytic study A study designed to examine associations, commonly putative or hypothesized
causal relationships. An analytic study is usually concerned with identifying
or measuring the effects of risk factors, or is concerned with the health effects
of specific exposure(s). Contrast descriptive study, which does not test hypotheses. 
The common types of analytic study are cross-sectional, cohort, and case-con¬ 
trol. In an analytic study, individuals in the study population may be classified 
according to absence or presence (or future development) of specific disease and 
according to “attributes" that may influence disease occurrence. Attributes may in¬ 
clude age, race, sex, other disease(s), genetic, biochemical, and physiological char¬ 
acteristics, economic status, occupation, residence, and various aspects of the envi¬ 
ronment or personal behavior. See also case control study; cohort study; cross- 
sectional study; study design. 

animal model Study in a population of laboratory animals that uses conditions of animals
analogous to conditions of humans to model processes comparable to those
that occur in human populations. See also experimental epidemiology. 

antagonism Opposite of synergism. The situation in which the combined effect of two 
or more factors is smaller than the solitary effect of any one of the factors. In 
bioassay, the term may be used to refer to the situation when a specified response 
is produced by exposure to either of two factors but not by exposure to both to- 
gether. 

anthropometry The technique that deals with the measurement of the size, weight, 
and proportions of the human body. 

anthropophilic (adj.) Pertaining to an insect’s preference for feeding on humans even
when nonhuman hosts are available. 

antibody Protein molecule formed by exposure to a “foreign" or extraneous substance, 
e.g., invading microorganisms responsible for infection, or active immunization. May 
also be present as a result of passive transfer from mother to infant, via immune 
globulin, etc. Antibody has the capacity to bind specifically to the foreign substance 
(antigen) that elicited its production, thus supplying a mechanism for protection 
against infectious diseases. Antibody is epidemiologically important because its concentration
(titer) can be measured in individuals, and, therefore, in populations.
See also seroepidemiology. 

antigen A substance (protein, polysaccharide, glycolipid, tissue transplant, etc.) that is 
capable of inducing specific immune response. Introduction of antigen may be by 
the invasion of infectious organisms, immunization, inhalation, ingestion, etc. 

antigenic drift This term describes the “evolutionary” changes that take place in the
molecular structure of DNA/RNA in micro-organisms during their passage from 
one host to another. It may be due to recombination, deletion or insertion of genes, 
to point mutations, or to several of these events. This process has been studied in 
common viruses, notably the influenza virus. 1 It leads to alteration (usually slow 
and progressive) in the antigenic composition, and thus in the immunologic responses
of individuals and populations to exposure to the micro-organisms concerned.
See also antigenic shift.

1 Palese P, Young JF: Variation of Influenza A, B, and C Viruses. Science 215:1468-1473, 1982.

antigenic shift This term describes mutation, i.e., a sudden change in molecular 
structure of DNA/RNA in micro-organisms, especially viruses, which produces new 
strains of the micro-organism. Hosts previously exposed to other strains have little 
or no acquired immunity. Antigenic shift is believed to be the explanation for the 
occurrence of strains of the influenza A virus associated with large-scale epidemic 
and pandemic spread. Antigenic shift is responsible for the susceptibility of host 
populations to a new strain of influenza virus. See also antigenic drift. 

antigenicity (Syn: immunogenicity) The ability of agent(s) to produce a systemic or a
local immunologic reaction in the host. 




arbovirus A group of taxonomically diverse animal viruses that are unified by an ep¬ 
idemiologic concept, i.e., transmission between vertebrate host organisms by blood¬ 
feeding (hematophagous) arthropod vectors such as mosquitoes, ticks, sand flies, 
and midges. The term is a contraction of arthropod-borne virus. 

The interaction of arbovirus, vertebrate host(s), and arthropod vector gives this 
class of infections several unique epidemiologic features. See vector-borne infection
for definition of terms used to describe these features.

area sampling A method of sampling that can be used when the numbers in the pop¬ 
ulation are unknown. The total area to be sampled is divided into subareas, e.g., by 
means of a grid that produces squares on a map; these subareas are then numbered 
and sampled, using a table of random numbers. Depending upon circumstances, 
the population in the sampled areas may first be enumerated, then a second stage 
of sampling may be conducted. 
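
The first stage described above can be sketched as follows. This is an illustration under stated assumptions: a rectangular grid is assumed, a seeded pseudo-random generator stands in for the table of random numbers, and the function and parameter names are hypothetical.

```python
import random


def area_sample(n_rows, n_cols, n_sampled, seed=None):
    """Number the grid squares 1..n_rows*n_cols and draw a simple random sample."""
    rng = random.Random(seed)                 # stand-in for a random number table
    squares = range(1, n_rows * n_cols + 1)   # one number per subarea on the map
    return sorted(rng.sample(squares, n_sampled))


# Hypothetical map divided into a 10 x 10 grid; sample 5 subareas.
chosen = area_sample(10, 10, 5, seed=1)
print("Subareas to enumerate:", chosen)
```

The population within each chosen subarea could then be enumerated and, if needed, sampled again in a second stage.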

arithmetic mean The sum of all the values in a set of measurements, divided by the 
number of values in the set. 

artificial intelligence A branch of computer science in which attempts are made to
duplicate human intellectual functions. One application is in diagnosis, in which
computer programs are often based upon epidemiologic analyses of data in hospital
charts or other clinical records.

ascertainment The process of determining what is happening in a population or study 
group, e.g., family and household composition, occurrence of cases of specific diseases;
the latter is also known as case-finding.

ascertainment bias Systematic failure to represent equally all classes of cases or persons
supposed to be represented in a sample. This bias may arise because of the
nature of the sources from which persons come, e.g., a specialized clinic; from a
diagnostic process influenced by culture, custom, or idiosyncrasy; or, for example,
in genetic studies, from the statistical chance of selecting from large or small fami¬ 
lies. 

assay The quantitative or qualitative evaluation of a hazardous substance; the results 
of such an evaluation. 

association (Syn: correlation, [statistical] dependence, relationship) Statistical dependence
between two or more events, characteristics, or other variables. An association
is present if the probability of occurrence of an event or characteristic, or the
quantity of a variable, depends upon the occurrence of one or more other events, 
the presence of one or more other characteristics, or the quantity of one or more 
other variables. The association between two variables is described as positive when 
the occurrence of higher values of a variable is associated with the occurrence of 
higher values of another variable. In a negative association, the occurrence of higher 
values of one variable is associated with lower values of the other variable. An as¬ 
sociation may be fortuitous or may be produced by various other circumstances; 
the presence of an association does not necessarily imply a causal relationship. If 
the use of the term “association” is confined to situations in which the relationship
between two variables is statistically significant, the terms “statistical association” and
“statistically significant association” become tautological. However, ordinary usage
is seldom so precise as this. The terms “association” and “relationship” are often
used interchangeably. 

Associations can be broadly grouped under two headings, symmetrical or non- 
causal (see below) and asymmetrical or causal. 

association, asymmetrical (Syn: asymmetrical relationship) The definitive conditions 
of asymmetrical associations are direction and time. Independent variable X must 
cause changes in dependent variable Y, and the “causal” variable must precede its
“effects.” Bradford Hill 1 and others 2,3 have pointed out that the (subjective) likelihood
of a causal relationship is increased by the presence of the following attributes.
However, temporality is the only indispensable condition among these.

1. Consistency—The association is consistent if the results are replicated when 
studied in different settings and by different methods. 

2. Strength—This is an expression of the disparity between the frequency with 
which a factor is found in the disease and the frequency with which it occurs 
in the absence of the disease. Not to be confused with statistical significance. 

3. Specificity—This is established with the limitation of the association to a single 
putative cause and single effect. 

4. Dose-response relationship—This is established when an increased risk or severity
in disease occurs with an increased quantity (“dose”) or duration of exposure
to a factor.

5. Temporality—The exposure to a putative cause always precedes, never follows,
the outcome.

6. Biological plausibility—It is desirable that the association agree with current 
understanding of the response of cells, tissues, organs, and systems to stimuli.
This criterion should not be applied rigidly. The association may be new to 
science or medicine. As Sherlock Holmes advised Dr. Watson, "When you have 
eliminated the impossible, whatever remains, however improbable, must be 
the truth." 

7. Coherence—The associations should not conflict with the generally known facts 
of the natural history and biology of disease. 

8. Experiment—It is sometimes possible to appeal to experimental, or quasi-experimental,
evidence, e.g., an observed association leads to some preventive
action. Does this action in fact prevent? 

See also causality; Evans’s postulates; Koch’s postulates.

1 Bradford Hill A: The environment and disease: Association or causation. Proc R Soc Med 58:295-300, 1965.

2 Susser MW: Judgment and causal inference. Am J Epidemiol 105:1-15, 1977.

3 Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.

association, direct Directly associated, i.e., not via a known third variable: A→B. Refers
only to causality.

association, indirect causal Two types are distinguished: 

1. Association of a factor C with disease A only because both are related to a
common underlying factor B.


Alteration of factor C will not produce an alteration in the frequency of disease
A unless an alteration in C affects B. It has been suggested that to avoid
confusion with the alternative meaning of indirect association, this type should
be called “secondary association.”

2. Association of a factor C with disease A by means of an intermediate or intervening
factor B.


Alteration of factor C would produce an alteration in the frequency of disease
A. To avoid confusion, this type should be called “indirect causal association.”

association, spurious A term, preferably avoided, used with different meanings by
different authors. It may refer to artifactual, fortuitous, false secondary, or to all
kinds of noncausal associations due to chance, bias, failure to control for extraneous 
variables, etc. 

association, symmetrical An association is noncausal if it is symmetrical, as in the
statement F = MA (force equals mass times acceleration). This is a noncausal, nondirectional
expression of the mathematical relationship between the physical properties
of force, mass, and acceleration. If one side of the equation is changed, then the
other must also change to maintain equilibrium.

Although epidemiologists are usually most interested in asymmetrical statements 
that have direction, the symmetrical equation can be useful. For instance, prevalence
can be expressed in terms of incidence and duration in the simple equation,
P = I × D. If two of these three elements are known, the third can be derived. See
also SYMMETRICAL RELATIONSHIP.
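
As a small illustration of how the third element follows from the other two in the relation prevalence = incidence × duration, consider the sketch below. The function name and the numbers are hypothetical, and consistent units are assumed (for example, incidence per person-year and duration in years).

```python
def solve_prevalence_relation(prevalence=None, incidence=None, duration=None):
    """Given exactly two of P, I, D in P = I * D, return the missing one."""
    supplied = [v is not None for v in (prevalence, incidence, duration)]
    if sum(supplied) != 2:
        raise ValueError("supply exactly two of prevalence, incidence, duration")
    if prevalence is None:
        return incidence * duration
    if incidence is None:
        return prevalence / duration
    return prevalence / incidence


# Hypothetical condition: 2 new cases per 1,000 person-years, lasting 5 years on average.
p = solve_prevalence_relation(incidence=0.002, duration=5)
d = solve_prevalence_relation(prevalence=0.01, incidence=0.002)
print(f"prevalence = {p:.3f}, duration = {d:.1f} years")
```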

assortative mating Selection of a mate with preference (or aversion) for a particular
genotype, i.e., nonrandom mating. 

ASYMMETRICAL ASSOCIATION See ASSOCIATION, ASYMMETRICAL. 

asymptotic Pertaining to a limiting value, for example, of a dependent variable, when 
the independent variable approaches zero or infinity. See large sample method. 

asymptotic method See large sample method. 

attack rate Attack rate, or case rate, is a cumulative incidence rate often used for 
particular groups, observed for limited periods and under special circumstances, as 
in an epidemic. 

The secondary attack rate is the number of cases among contacts occurring within 
the accepted incubation period following exposure to a primary case, in relation to 
the total of exposed contacts; the denominator may be restricted to susceptible con¬ 
tacts when determinable. 

Infection rate is the incidence of manifest plus inapparent infections, which can be 
identified, e.g., by seroepidemiology.
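
The secondary attack rate defined above can be illustrated with a short sketch. The function name, the optional count of non-susceptible contacts, and the household figures are assumptions made for the example.

```python
def secondary_attack_rate(secondary_cases, exposed_contacts, nonsusceptible_contacts=0):
    """Cases among contacts divided by exposed (optionally only susceptible) contacts."""
    denominator = exposed_contacts - nonsusceptible_contacts
    if denominator <= 0:
        raise ValueError("no contacts remain in the denominator")
    return secondary_cases / denominator


# Hypothetical outbreak: 12 secondary cases among 80 exposed household contacts,
# of whom 20 are known to be immune.
all_contacts = secondary_attack_rate(12, 80)
susceptible_only = secondary_attack_rate(12, 80, nonsusceptible_contacts=20)
print(f"{all_contacts:.0%} of all contacts, {susceptible_only:.0%} of susceptible contacts")
```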

attributable fraction (af) (Syn: attributable proportion) A term sometimes used to 
refer to the attributable fraction in the population, and sometimes to the attribut¬ 
able fraction among the exposed. See also attributable fraction (exposed); at¬ 
tributable fraction (population). 

attributable fraction (exposed) (Syn: attributable proportion [exposed], attributable
risk, etiologic fraction [exposed]). With a given outcome, exposure factor, and
population, the attributable fraction among the exposed is the proportion by which 
the incidence rate of the outcome among those exposed would be reduced if the 
exposure were eliminated. It may be estimated by the formula

\[ AF_e = \frac{I_e - I_u}{I_e} \]

where I_e is the incidence rate among the exposed, I_u is the incidence rate among
the unexposed; or by the formula

\[ AF_e = \frac{RR - 1}{RR} \]

where RR is the rate ratio, I_e/I_u. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups. 
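
A brief sketch may help show that the two formulas for the attributable fraction among the exposed agree. The function names and the incidence rates (30 and 10 per 1,000 person-years) are hypothetical.

```python
def attributable_fraction_exposed(rate_exposed, rate_unexposed):
    """(I_e - I_u) / I_e: share of the exposed group's rate due to the exposure."""
    return (rate_exposed - rate_unexposed) / rate_exposed


def attributable_fraction_exposed_from_rr(rate_ratio):
    """(RR - 1) / RR, where RR = I_e / I_u."""
    return (rate_ratio - 1) / rate_ratio


i_e, i_u = 0.030, 0.010          # hypothetical incidence rates
rr = i_e / i_u                   # rate ratio
print(f"from rates: {attributable_fraction_exposed(i_e, i_u):.3f}")
print(f"from RR:    {attributable_fraction_exposed_from_rr(rr):.3f}")
```
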
attributable fraction (population) (Syn: attributable proportion [population], etiologic
fraction [population], attributable risk). With a given outcome, exposure factor,
and population, the attributable fraction among the population is the proportion
by which the incidence rate of the outcome in the entire population would be
reduced if exposure were eliminated. It may be estimated by the formula

\[ AF_p = \frac{I_p - I_u}{I_p} \]

where I_p is the incidence rate in the total population and I_u is the incidence rate
among the unexposed; or by the formula

\[ AF_p = \frac{P_e (RR - 1)}{1 + P_e (RR - 1)} \]

where RR is the rate ratio, I_e/I_u, and P_e is the proportion of the population that is
exposed. It is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups.
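
The population attributable fraction can be checked numerically in the same way. In this sketch the proportion exposed (written P_e here) is an assumed input, the population rate is taken as the exposure-weighted average of the exposed and unexposed rates, and all names and figures are hypothetical.

```python
def attributable_fraction_population(rate_population, rate_unexposed):
    """(I_p - I_u) / I_p: share of the population's rate due to the exposure."""
    return (rate_population - rate_unexposed) / rate_population


def attributable_fraction_population_from_rr(proportion_exposed, rate_ratio):
    """P_e (RR - 1) / (1 + P_e (RR - 1))."""
    excess = proportion_exposed * (rate_ratio - 1)
    return excess / (1 + excess)


p_e, i_e, i_u = 0.25, 0.030, 0.010        # hypothetical inputs
i_p = p_e * i_e + (1 - p_e) * i_u         # rate in the total population
print(f"from rates: {attributable_fraction_population(i_p, i_u):.3f}")
print(f"from RR:    {attributable_fraction_population_from_rr(p_e, i_e / i_u):.3f}")
```
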
attributable number The number of new occurrences of a specific outcome attributable
to an exposure; it may be estimated using the formula

\[ AN = N_e (I_e - I_u) \]

where I_e is the incidence rate among the exposed, I_u is the incidence rate among
the unexposed, and N_e is the number of persons in the exposed population. It is
assumed that causes other than the one under investigation have had equal effects 
on the exposed and unexposed groups. 
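
A minimal numerical sketch of the attributable number, assuming the estimate is the exposed population multiplied by the difference in incidence rates; the function name and figures are hypothetical.

```python
def attributable_number(n_exposed, rate_exposed, rate_unexposed):
    """N_e * (I_e - I_u): new occurrences attributable to the exposure."""
    return n_exposed * (rate_exposed - rate_unexposed)


# Hypothetical: 5,000 exposed persons, rates of 30 and 10 per 1,000 per year.
an = attributable_number(5_000, 0.030, 0.010)
print(f"about {an:.0f} attributable cases per year")
```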

attributable risk The rate of a disease or other outcome in exposed individuals that 
can be attributed to the exposure. This measure is derived by subtracting the rate
of the outcome (usually incidence or mortality) among the unexposed from the rate
among the exposed individuals; it is assumed that causes other than the one under
investigation have had equal effects on the exposed and unexposed groups. Unfortunately,
this term has been used to denote a number of different concepts, including
the attributable fraction in the population, the attributable fraction among the
exposed, the population excess rate, and the rate difference. Therefore, it should
be defined carefully by all who use it. See also attributable fraction (exposed);
population excess rate; attributable fraction (population); population attributable
risk; rate difference.

attributable risk (exposed) This term has been used with different connotations to 
denote the attributable fraction among the exposed and the excess risk among the 
exposed. See also attributable fraction (exposed); rate difference.
attributable risk (population) This term has been used with different connotations
to denote the attributable fraction in the population and the population excess risk.
See also attributable fraction (population); population excess rate.
attributable risk percent Attributable fraction expressed as a percentage rather
than as a proportion.

attributable risk percent (exposed) This is the attributable fraction among the exposed,
expressed as a percentage. See also attributable fraction (exposed).
attributable risk percent (population) This is the attributable fraction in the population,
expressed as a percentage. See also attributable fraction (population).
attribute A qualitative characteristic of an individual or item.
audit An examination or review that establishes the extent to which a condition, pro¬ 
cess, or performance conforms to predetermined standards or criteria. 
autopsy data Data derived from autopsied deaths, e.g., for study of natural history of 
disease and trends in frequency of disease. Autopsies are done on nonrandomly 
selected persons in the population and findings should therefore be generalized 
only with great caution.
average Kendall and Buckland’s Dictionary of Statistical Terms (4th Edition, 1982) has
this to say: "A familiar but elusive concept. Generally an 'average' value purports
to represent or to summarize the relevant features of a set of values; and in this
sense the term would include the median and the mode. In a more limited sense
an 'average' compounds all the values of the set, e.g., in the case of the arithmetic
or geometric means. In ordinary usage, 'the average' is often understood to refer 
to the arithmetic mean." See also measures of central tendency. 

AVERAGE LIFE EXPECTANCY See EXPECTATION OF LIFE. 


AXIS 


1. One of the dimensions of a graph. A two-dimensional graph has two axes, the 
horizontal or x axis, and the vertical or y axis. Mathematically, there may be 
more than two axes, and graphs are sometimes drawn with a third dimension; 
the eye cannot comprehend more than three dimensions. 

2. In nosology, an axis of classification is the conceptual framework, e.g., etiologic,
topographic, psychologic, sociologic. The International Classification of
Diseases, for example, is multiaxial; the primary axis is topographic (i.e., body
systems); secondary axes relate to etiology, manifestations of disease, detail of
sites affected, severity, etc. 


Source: https://www.industrydocuments.ucsf.edu/docs/lnbj0000

background level, rate The concentration, often low, at which some substance, agent,
or event is present or occurs at a particular time and place in the absence of a 
specific hazard or set of hazards under investigation. An example is the background 
level of naturally occurring forms of ionizing radiation to which we are all exposed. 

BAR diagram A graphic technique for presenting discrete data organized in such a 
way that each observation can fall into one and only one category of the variable. 
Frequencies are listed along one axis and categories of the variable along the other 
axis. The frequencies of each group of observations are represented by the lengths 
of the corresponding bars. See also histogram.

[Figure: Bar diagram. From Susser, Watson, Hopper, 1985.]

Bayes' theorem A theorem in probability theory named for Thomas Bayes (1702-
1761), an English clergyman and mathematician; his Essay Towards Solving a Problem
in the Doctrine of Chances (1763, published posthumously), contained this theorem.
In epidemiology, it is used to obtain the probability of disease in a group of people
with some characteristic on the basis of the overall rate of that disease (the prior
probability of disease) and of the likelihoods of that characteristic in healthy and
diseased individuals. The most familiar application is in clinical decision analysis
where it is used for estimating the probability of a particular diagnosis given the
appearance of some symptoms or test result. A simplified version of the theorem is

\[ P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S \mid D)\,P(D) + P(S \mid \bar{D})\,P(\bar{D})} \]

where D = disease, S = symptom, and \bar{D} = no disease. The formula emphasizes what
clinical intuition often overlooks, namely, that the probability of disease given this
symptom depends not only on how characteristic that symptom is of the disease but
also on how frequent the disease is among the population being served. “If you
hear hoof beats in the street, do not look for zebra.”

The theorem can also be used for estimating exposure-specific rates from case
control studies if there is added information about the overall rate of disease in that
population.

Some of the terms in the theorem have special names. The probability of disease
given the symptom is called the “posterior probability.” It is an estimate of the
probability of disease posterior to knowing whether or not the symptom was present.
The overall probability of disease among the population or our guess of the
probability of disease before knowing of the presence or absence of the symptom
is called the “prior probability.” The theorem is sometimes presented in terms of
the odds of disease before knowing the symptom (prior odds) and after knowing
the symptom (posterior odds).

behavioral epidemic An epidemic originating in behavioral patterns (as opposed to
invading microorganisms or physical agents). Examples include the dancing manias
of the Middle Ages, episodes of mass fainting or convulsions (“hysterical epidemics”),
crowd panic, or waves of fashion or enthusiasm. The communicable nature of
the behavior is dependent not only on person-to-person transmission of the behavioral
pattern but also on group reinforcement (as with smoking, alcohol, or drug
use). Behavioral epidemics may be difficult to differentiate from, or may complicate,
outbreaks of organic disease, for example, due to contamination of the environment
by a toxic substance.

behavioral risk factor A characteristic or behavior that is associated with increased
probability of a specified outcome; the term does not imply a causal relationship.

benchmark A slang or jargon term, usually meaning a measurement taken at the outset
of a series of measurements of the same variable, sometimes meaning the best
or most desirable value of the variable. Because of uncertainty about meaning, the
term should not be used.

benefit-cost ratio The ratio of net present value of measurable benefits to costs.
Calculation of a benefit-cost ratio is used to determine the economic feasibility or
success of a program.

Bernoulli distribution The probability distribution associated with two mutually exclusive
and exhaustive outcomes, e.g., death or survival; a Bernoulli variable is one
that has only two possible values, e.g., death or survival. See also binomial distribution.
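
To illustrate the simplified form of Bayes' theorem given in the entry above, the sketch below computes the posterior probability of disease for the same symptom in two settings with different prior probabilities. The function names and all probabilities are hypothetical values chosen for the example.

```python
def posterior_probability(prior, p_symptom_given_disease, p_symptom_given_no_disease):
    """P(D|S) = P(S|D)P(D) / [P(S|D)P(D) + P(S|not D)P(not D)]."""
    numerator = p_symptom_given_disease * prior
    denominator = numerator + p_symptom_given_no_disease * (1 - prior)
    return numerator / denominator


def posterior_odds(prior, p_symptom_given_disease, p_symptom_given_no_disease):
    """Odds form: posterior odds = prior odds x likelihood ratio."""
    prior_odds = prior / (1 - prior)
    likelihood_ratio = p_symptom_given_disease / p_symptom_given_no_disease
    return prior_odds * likelihood_ratio


# Same symptom (present in 90% of diseased and 10% of non-diseased persons),
# evaluated where the disease is rare (1%) and where it is common (30%).
rare = posterior_probability(0.01, 0.90, 0.10)
common = posterior_probability(0.30, 0.90, 0.10)
print(f"posterior when prior is 1%:  {rare:.3f}")
print(f"posterior when prior is 30%: {common:.3f}")
```

Even a fairly characteristic symptom yields a low posterior probability when the disease is rare in the population served, which is the point of the hoof-beats remark.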

BERKSON'S BIAS See BIAS, SELECTION.

beta error See error, type II. 

bias Deviation of results or inferences from the truth, or processes leading to such 
deviation. Any trend in the collection, analysis, interpretation, publication, or re- 
view of data that can lead to conclusions that are systematically different from the 
truth. Among the ways in which deviation from the truth can occur, are the follow¬ 
ing: 

1. Systematic (one-sided) variation of measurements from the true values (syn:
systematic error). 


2. Variation of statistical summary measures (means, rates, measures of association,
etc.) from their true values as a result of systematic variation of measurements,
other flaws in data collection, or flaws in study design or analysis.

3. Deviation of inferences from the truth as a result of flaws in study design, 
data collection, or the analysis or interpretation of results.

4. A tendency of procedures (in study design, data collection, analysis, interpretation,
review or publication) to yield results or conclusions that depart from
the truth. 

5. Prejudice leading to the conscious or unconscious selection of study proce¬ 
dures that depart from the truth in a particular direction, or to one-sidedness 
in the interpretation of results. 

The term “bias" does not necessarily carry an imputation of prejudice or other 
subjective factor, such as the experimenter's desire for a particular outcome. This
differs from conventional usage in which bias refers to a partisan point of view.

Many varieties of bias have been described. 1 

1 Sackett DL: Bias in analytic research. J Chron Dis 32:51-63, 1979.

bias, ascertainment Systematic error, arising from the kind of individuals or patients
(e.g., slightly ill, moderately ill, acutely ill) that the individual observer is seeing. 
Also systematic error arising from the diagnostic process (which may be determined 
by the culture, customs, or individual idiosyncrasy of the person providing care for 
the patient). 

bias in assumption (Syn: conceptual bias) Error arising from faulty logic or premises
or mistaken beliefs on the part of the investigator. False conclusions about the ex¬ 
planation for associations between variables. Example: Having correctly deduced 
the mode of transmission of cholera, John Snow concluded that yellow fever was 
transmitted by similar means. In fact, the “miasma" theory would better fit the facts 
of yellow fever transmission. 

bias in autopsy series Systematic error resulting from the fact that autopsies represent
a nonrandom sample of all deaths.

bias, Berkson’s See bias, selection.

BIAS DUE TO CONFOUNDING See CONFOUNDING.

bias, design The difference between a true value and that actually obtained, occurring
as a result of faulty design of a study. Some examples are (1) uncontrolled studies
where the effects of two processes cannot be separated (confounding), (2) controlled
studies where observations are based on a poorly defined population, and
(3) nonsimultaneous comparisons, e.g., use of historical controls.

bias, detection Due to systematic error(s) in methods of ascertainment, diagnosis, or
verification of cases in an epidemiologic survey, study, or investigation. Example: 
Verification of diagnosis by laboratory tests in hospital cases, but failure to apply 
the same tests to cases outside the hospital. 

bias due to digit preference See digit preference. 

bias in handling outliers Error arising from a failure to discard an unusual value
occurring in a small sample, or due to exclusion of unusual values that should be 
included. 

bias, information (Syn: observational bias) A flaw in measuring exposure or outcome
that results in differential quality (accuracy) of information between compared groups. 

bias due to instrumental error Systematic error due to faulty calibration, inaccurate
measuring instruments, contaminated reagents, incorrect dilution or mixing of
reagents, etc.

bias of interpretation Error arising from inference and speculation. Sources of the
error include (1) failure of the investigator to consider every interpretation consistent
with the facts and to assess the credentials of each, and (2) mishandling of
cases that constitute exceptions to some general conclusion. 

bias, interviewer Systematic error due to interviewers’ subconscious or even conscious
gathering of selective data.

bias, “lead-time” A systematic error arising when follow-up of two groups does not
begin at strictly comparable times. Occurs especially when one group has been di¬ 
agnosed earlier in the natural history of the disease than the other group. See also 

ZERO TIME SHIFT. 

bias, length A systematic error due to the selection of a disproportionate number of
long-duration cases (cases who survive longest) in one group and not in the other.
Can occur when prevalent cases, rather than incident cases, are included in a case
control study. 

bias, measurement Systematic error arising from inaccurate measurement (or classification)
of subjects on the study variables.

bias, observer Systematic difference between a true value and that actually observed
due to observer variation. Observer variation may be due to differences among
observers (interobserver variation) or to variation in readings by the same observer
on separate occasions (intraobserver variation). See also observer variation. 

bias in the presentation of data Error due to irregularities produced by digit pref¬ 
erence, incomplete data, poor techniques of measurement, or technically poor lab¬ 
oratory standards. 

bias in publication An editorial predilection for publishing particular findings, e.g., 
positive results, which leads to the failure of authors to submit negative findings for 
publication or failure of journal editors to accept and publish reports with negative 
findings. This can distort the general belief about what has been demonstrated in a 
particular situation. 

bias of an estimator The difference between the expected value of an estimator of a
parameter and the true value of this parameter. See also unbiassed estimator. 
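
As a worked illustration of this definition, the sketch below computes the exact bias of two variance estimators by enumerating every equally likely sample of size 2 drawn with replacement from a tiny population. The population values, sample size, and function names are assumptions made for the example.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def variance_divisor_n(sample):
    """Sum of squared deviations divided by n."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


def variance_divisor_n_minus_1(sample):
    """Sum of squared deviations divided by n - 1."""
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)


def expected_value(estimator, population, n):
    """Average the estimator over all samples of size n drawn with replacement."""
    samples = list(product(population, repeat=n))
    return sum(estimator(s) for s in samples) / len(samples)


population = [1, 2, 3, 4]
true_variance = variance_divisor_n(population)   # the parameter being estimated

bias_n = expected_value(variance_divisor_n, population, 2) - true_variance
bias_n_minus_1 = expected_value(variance_divisor_n_minus_1, population, 2) - true_variance
print(f"bias with divisor n:     {bias_n}")
print(f"bias with divisor n - 1: {bias_n_minus_1}")
```

In this setup the estimator with divisor n underestimates the population variance on average, while the one with divisor n - 1 has zero bias.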

bias, recall Systematic error due to differences in accuracy or completeness of recall
to memory of prior events or experiences. Example: Mothers whose children have 
had or have died of leukemia are more likely than mothers of healthy living chil¬ 
dren to remember details of diagnostic x-ray examinations to which these children 
were exposed in utero. 

bias, reporting Selective suppression or revealing of information such as past history
of sexually transmitted disease. 

bias, response Systematic error due to difference in characteristics between those who
choose or volunteer to participate in a study and those who do not. 

bias, sampling Unless the sampling method ensures that all members of the “universe”
or reference population have a known chance of selection in the sample, bias is
possible. The best way to ensure a known chance of selection for all is to use a
probability sampling method such as a table of random numbers.

bias, selection Error due to systematic differences in characteristics between those
who are selected for study and those who are not. Examples include hospital cases
or cases under a physician's care, excluding those who die before admission to
hospital because the course of their disease is so acute, those not sick enough to require
hospital care, or those excluded by distance, cost, or other factors. Selection bias
also invalidates generalizable conclusions from surveys that would include only
volunteers from a healthy population.

A special example is Berkson's bias,1 which Berkson characterized as the set of
selective factors that lead hospital cases and controls in a case control study to be
systematically different from one another. This occurs when the combination of 
exposure and disease under study increases the risk of hospital admission, thus 
leading to a systematically higher exposure rate among the hospital cases than the 
hospital controls. This in turn results in systematic distortion of the odds ratio. 

1 Berkson J: Limitations of the application of fourfold table analysis to hospital data.
Biometrics Bull 2:47-53, 1946.

bias due to withdrawals A difference between the true value and that actually ob¬ 
served in a study due to the characteristics of those subjects who choose to with¬ 
draw. 

Bills of Mortality Weekly and annual abstracts of christenings and burials, distin¬ 
guishing deaths from the plague, compiled for London (and some other cities), 
especially in times of plague, from the English parish registers that started in 1558. 
From 1629, the annual bill was published regularly and included a breakdown of 
deaths by cause. These records were the basis for the earliest vital statistics,
compiled, analyzed, and discussed by John Graunt in Natural and Political Observations
... on the Bills of Mortality (1662).

bimodal distribution A distribution with two regions of high frequency separated by
a region of low frequency of observations. A two-peak distribution. 

binary variable A variable having only two possible values, e.g., on or off, 0 or 1. See
also BIT. 

binomial distribution A probability distribution associated with two mutually
exclusive outcomes, e.g., presence or absence of a clinical or laboratory sign, death, or
survival. The probability distribution of the number of occurrences of a binary
event in a sample of n independent observations. The binomial distribution is used
to model cumulative incidence rates and prevalence rates. The Bernoulli
distribution is a special case of the binomial distribution with n = 1.
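
As an illustration only (it is not part of the original entry), the probability of
exactly k occurrences in n independent observations, each with the same probability p,
is C(n, k) p^k (1 - p)^(n - k). The minimal sketch below assumes a constant p; the
function name binomial_pmf is hypothetical.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k occurrences in n independent observations."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# The Bernoulli distribution is the special case n = 1:
# the two probabilities are simply 1 - p and p.
bernoulli = [binomial_pmf(k, 1, 0.3) for k in (0, 1)]

# Probabilities over all possible counts 0..n sum to one.
total = sum(binomial_pmf(k, 10, 0.2) for k in range(11))
```

For example, a prevalence of 0.2 in a sample of 10 persons can be explored by
evaluating binomial_pmf(k, 10, 0.2) for each possible count k.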

bioassay The quantitative evaluation of the potency of a substance by assessing its ef¬ 
fects on tissues, cells, live experimental animals, or humans. 

Bioassay may be a direct method of estimating relative potency: groups of sub¬ 
jects are assigned to each of two (or more) preparations; the dose that is just suffi¬ 
cient to produce a specified response is measured, and the estimate is the ratio of 
the mean doses for the two (or more) groups. In this method, the death of the 
subject may be used as the “response.”

The indirect method (more commonly used) requires study of the relationship 
between the magnitude of a dose and the magnitude of a quantitative response 
produced by it. 

biological plausibility The criterion that an observed, presumably or putatively causal 
association fits previously existing biological or medical knowledge. This judgment 
should be used cautiously since it could impede development of new knowledge 
that does not fit existing ideas. 

BIOLOGICAL TRANSMISSION See VECTOR-BORNE INFECTION. 

biometry (literally, the measurement of life) The application of statistical methods to the
study of numerical data based on biological observations and phenomena. The term
was coined by W. F. R. Weldon (1860-1906), a zoologist at University College,
London. Francis Galton has been called "the father of biometry" for his
application of statistical methods to the analysis of biological variation. However, others
preceded him, e.g., Quetelet and Louis.

biostatistics Application of statistics to biological problems. The term is considered 
by many biomedical scientists to mean the application of statistics specifically to
medical problems, but its real meaning is broader. 

Biraud, Yves (1900-1965) French physician and statistician. He served the League of 
Nations and later WHO as Director of Epidemiological and Statistical Services from 
1925 to 1960. In 1960, he founded the first chair of Health Statistics in France, at
the Ecole de santé publique in Rennes.

birth certificate Official, legal document recording details of a live birth, usually 
comprising name, date, place, identity of parents, and sometimes additional infor¬ 
mation such as birth weight. It provides the basis for vital statistics of birth and 
birthrates in a political or administrative jurisdiction, and for the denominator for 
infant mortality and certain other vital rates. 

BIRTH COHORT See COHORT. 

BIRTH COHORT ANALYSIS See COHORT ANALYSIS. 

birth interval Interval between termination of one completed pregnancy and the 
termination of the next. 

birth order The ranking of siblings according to age, starting with the eldest in a
family. The ordinal number of a given live birth in relation to all previous live
births of the same woman. Thus, 4 is the birth order of the fourth live birth
occurring to the same woman. This strict demographic definition may be loosened to
include all births, i.e., still-births as well as live births.

birth rate A summary rate based on the number of live births in a population over a
given period, usually one year. 


              Number of live births to residents in an area in a calendar year
Birth rate = ------------------------------------------------------------------
              Average or midyear population in the area in that year


birth weight Infant's weight recorded at the time of birth and, in some countries,
entered on the birth certificate. Certain variants of birth weight are precisely
defined. Low birth weight (LBW) is below 2500 g. Very low birth weight (VLBW) is
below 1500 g. Ultralow birth weight (ULBW) is below 1000 g. Large for gestational
age (LGA) is birth weight above the 90th percentile. Average weight for gestational
age (AGA) (Syn: appropriate or adequate): birth weight between 10th and 90th
percentiles. Small for gestational age (SGA) (Syn: small for dates): birth weight
below 10th percentile.
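
The thresholds above can be sketched as two small classifiers. This is an
illustrative sketch, not part of the original entry: the function names are
hypothetical, the weight categories are nested (an ULBW infant is also VLBW and
LBW) so the code returns the most specific label, and the percentile is assumed to
come from a reference chart for the infant's gestational age.

```python
def birth_weight_category(grams: float) -> str:
    """Most specific low-birth-weight label for a weight in grams."""
    if grams < 1000:
        return "ULBW"  # ultralow birth weight
    if grams < 1500:
        return "VLBW"  # very low birth weight
    if grams < 2500:
        return "LBW"   # low birth weight
    return "not low birth weight"


def gestational_age_category(percentile: float) -> str:
    """Label based on the birth weight percentile for gestational age."""
    if percentile < 10:
        return "SGA"   # small for gestational age
    if percentile > 90:
        return "LGA"   # large for gestational age
    return "AGA"       # average (appropriate) for gestational age
```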

bit Acronym for binary digit; the signal in computing. See also byte.

“black box” A jargon term, meaning a method of reasoning or studying a problem, 
in which the methods, procedures, etc., as such are not described, explained, or 
perhaps even understood. Nothing is stated or inferred about the method; discus¬ 
sion and conclusions relate solely to the empirical relationships observed. An alter¬ 
native definition is the following: A method of formally relating an input, e.g., 
quantity of a drug absorbed over a period or a putative causal factor, to an output, 
e.g., the amount of the drug eliminated in a given period, or an observed effect, 
without making detailed assumptions about the mechanisms that have contributed 
to the transformation of input to output within the organism (the "black box"). 

blind(ed) study (Syn: masked study) A study in which observer(s) and/or subjects are
kept ignorant of the group to which the subjects are assigned, as in an experiment,
or of the population from which the subjects come, as in a nonexperimental study.
When both observer and subjects are kept ignorant, we refer to a double-blind
study. If the statistical analysis is also done in ignorance of the group to which
subjects belong, the study is sometimes described as triple-blind. The intent of keeping
subjects and/or investigators blinded, i.e., unaware of knowledge that might intro¬ 
duce a bias, is to eliminate the effects of such biases. To avoid confusion about the 
meaning of the word “blind" some authors prefer to describe such studies as 
“masked.” 

blocked randomization See stratified randomization. The analogue in a randomized
experiment of individual matching in an observational study.

body mass index (Syn: Quetelet's index) One of the anthropometric measures of body
mass. Defined as (weight) ÷ (height)². This measure has the highest correlation
with skinfold thickness or body density and in this respect is superior to the
PONDERAL INDEX.
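
A minimal sketch of the calculation follows. The entry does not state units;
kilograms and metres are assumed here because they are the conventional choice, and
the function name body_mass_index is hypothetical.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Weight divided by the square of height (assumed units: kg and m)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m**2


# A person weighing 70 kg with a height of 1.75 m.
example = body_mass_index(70, 1.75)
```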

bootstrap A technique for estimating the variance and the bias of an estimator by 
repeatedly drawing random samples with replacement from the observations at hand. 
One applies the estimator to each sample drawn, thus obtaining a set of estimates. 
The observed variance of this set is the bootstrap estimate of variance. The differ¬ 
ence between the average of the set of estimates and the original estimate is the 
bootstrap estimate of bias. 
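
The procedure described above can be sketched as follows. This is an illustration
under stated assumptions rather than a reference implementation: the function name
bootstrap, the default of 1000 resamples, the fixed random seed, and the example
observations are all hypothetical choices.

```python
import random
import statistics


def bootstrap(data, estimator, n_resamples=1000, seed=0):
    """Return (bootstrap variance, bootstrap bias) of `estimator` on `data`."""
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    original = estimator(data)
    estimates = []
    for _ in range(n_resamples):
        # Draw a sample of the same size, with replacement, from the observations.
        resample = rng.choices(data, k=len(data))
        estimates.append(estimator(resample))
    # Observed variance of the set of estimates.
    variance = statistics.variance(estimates)
    # Average of the set of estimates minus the original estimate.
    bias = statistics.mean(estimates) - original
    return variance, bias


observations = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
var_of_mean, bias_of_mean = bootstrap(observations, statistics.mean)
```

Any estimator that accepts a list of observations can be passed in place of
statistics.mean, e.g., statistics.median.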

breakpoint In helminth epidemiology, the critical mean wormload in a community,
below which the helminth mating frequency is too low to maintain reproduction. A
value exceeding the breakpoint of a wormload means that the wormload will
increase until equilibrium is reached; a value less than or equal to the breakpoint
means that the wormload will decrease progressively.

byte A group of adjacent bits, commonly 4, 6, or 8, operating as a unit for storage and
manipulation of data in a computer. See also bit.

CALIPER MATCHING See MATCHING.

Canadian mortality data base A large set of computer-stored death statistics;
personal identifiers and causes of all deaths in Canada since 1950 have been
computer-stored, and the death certificates have been preserved on microfiche. This data base
and record linkage have been used in some important historical cohort studies. See
also NATIONAL DEATH INDEX.

CANCER REGISTRY See REGISTER. 

CARRIER 

1. A person or animal that harbors a specific infectious agent in the absence of
discernible clinical disease and serves as a potential source of infection. The
carrier state may occur in an individual with an infection that is inapparent
throughout its course (known as healthy or asymptomatic carrier), or during
the incubation period, convalescence, and postconvalescence of an individual
with a clinically recognizable disease (known as incubatory carrier or
convalescent carrier). The carrier state may be of short or long duration (temporary
or transient carrier or chronic carrier).1

1 Adapted from Control of Communicable Diseases in Man, 14th ed. Washington, DC:
American Public Health Association, 1985.

carrying capacity An estimate of the numbers of people that a nation, region, or the 
planet can sustain. 

case In epidemiology, a person in the population or study group identified as having 
the particular disease, health disorder, or condition under investigation. A variety 
of criteria may be used to identify cases, e.g., individual physicians' diagnoses,
registries and notifications, abstracts of clinical records, surveys of the general
population, population screening, and reporting of defects such as in a dental record.
The epidemiologic definition of a case is not necessarily the same as the ordinary 
clinical definition. 

case-base study A study that starts with the identification and sampling of persons 
with the disease of interest, and then samples the entire base population (of cases 
and noncases) from which the original cases arose. This design is similar to a case
control study in most respects, but cases may appear in the comparison (base)
sample as well as in the case sample. 

case, collateral A case occurring in the immediate vicinity of a case which has been 
the subject of an epidemiological investigation; a term used mainly in malaria con¬ 
trol programs, equivalent to the term contact as used in infectious disease epide¬ 
miology. 

case comparison study See case control study. 

case compeer study See case control study. 

CASE CONTROL STUDY (Syn: case comparison study, case compeer study, case history
study, case referent study, retrospective study) A study that starts with the
identification of persons with the disease (or other outcome variable) of interest, and a
suitable control (comparison, reference) group of persons without the disease. The
relationship of an attribute to the disease is examined by comparing the diseased
and nondiseased with regard to how frequently the attribute is present or, if
quantitative, the levels of the attribute, in each of the groups.

Such a study can be called "retrospective" because it starts after the onset of
disease and looks back to the postulated causal factors. Cases and controls in a case
control study may be accumulated "prospectively"; that is, as each new case is
diagnosed it is entered in the study. Nevertheless, such a study may still be called
"retrospective" because it looks back from the outcome to its causes. The terms
"cases" and "controls" are sometimes used to describe subjects in a randomized
controlled trial, but the term "case control study" should not be used to describe
such a study.

The terms "case control study" and "retrospective study" have been used most
often to describe this method. Other terms also used are listed above. The concept
of the case-control study is to be found in the works of P.C.A. Louis;1 the first
explicit description of the method is contained in a paper by William Augustus Guy,
who reported his analysis of the relationship between prior occupational exposure
and the occurrence of pulmonary consumption to the Statistical Society of London
in 1843.2 The evolution of the case-control study thereafter has been described by
Lilienfeld and Lilienfeld.3 The first modern use of the method was a case-control
study of breast cancer, reported by Lane-Claypon4 in 1926; from that time onward,
case-control studies became increasingly popular and widely used.

1 Louis PCA: Researches on Phthisis: Anatomical, Pathological and Therapeutical. (Trans. W.H.
Walshe). London: Sydenham Society, 1844.

2 Guy WA: Contributions to a knowledge of the influence of employments on health. J R Stat Soc
6:197-211, 1843.

3 Lilienfeld AM, Lilienfeld D: A century of case-control studies: progress? J Chron Dis 32:5-13,
1979.

4 Lane-Claypon JE: A further report on cancer of the breast. Rep Pub Hlth Med Subj 32. London:
HMSO, 1926.

case fatality rate The proportion of cases of a specified condition which are fatal
within a specified time. 


                                Number of deaths from a disease (in a given period)
Case fatality rate (usually  = ---------------------------------------------------------------- × 100
expressed as a percentage)      Number of diagnosed cases of that disease (in the same period)

This definition can lead to paradox when more persons die of the disease than 
develop it during a given period. For instance, chemical poisoning that is slowly but 
inexorably fatal may cause many persons to develop the disease over a relatively 
short period of time, but the deaths may not occur until some years later and may 
be spread over a period of years during which there are no new cases. Thus, in 
calculating the case fatality rate, it is necessary to acknowledge that the time dimen¬ 
sion varies: it may be brief, e.g., covering only the period of stay in a hospital, of 
finite duration, e.g., one year, or of longer duration still. The term "case fatality 
rate" is then better replaced by a term such as "survival rate" or by the use of a 
SURVIVORSHIP TABLE. See also ATTACK RATE. 
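
The calculation, and the paradox described above, can be sketched in a few lines.
This is an illustration only; the function name case_fatality_rate and the counts
used are hypothetical.

```python
def case_fatality_rate(deaths: int, diagnosed_cases: int) -> float:
    """Deaths as a percentage of diagnosed cases counted in the same period."""
    if diagnosed_cases <= 0:
        raise ValueError("at least one diagnosed case is required")
    return 100 * deaths / diagnosed_cases


# Ordinary situation: 5 deaths among 200 cases diagnosed in the period.
ordinary = case_fatality_rate(5, 200)

# Paradox: in a later period only 10 new cases are diagnosed, but 30 persons
# who developed the disease in earlier periods die. The "rate" exceeds 100%.
paradox = case_fatality_rate(30, 10)
```

A result above 100% signals that deaths and cases were not drawn from the same
group of persons, which is why a survival rate or survivorship table is preferred
when deaths lag far behind diagnosis.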


CASE HISTORY STUDY 

1. Synonym for case control study. 

2. In clinical medicine, a case report, or a report on a series of cases. 

case referent study See case control study. 

catastrophe theory A branch of mathematics dealing with large changes in the total 
system that may result from small changes in a critical variable in the system. An 
example is the sudden change in the physical state of water into steam or ice with 
rise or fall of temperature beyond a critical level. Certain epidemics, gene
frequencies, and behavioral phenomena in populations may abide by the same
mathematical rule. Herd immunity is an example.

catchment area Region, which may be well- or ill-defined, from which the clients of 
a particular health facility are drawn. 

causality The relating of causes to the effects they produce. Most of epidemiology 
concerns causality and several types of causes can be distinguished. It should be 
clearly stated, however, that epidemiologic evidence by itself is insufficient to estab¬ 
lish causality. 

A cause is termed "necessary" when it must always precede an effect. This effect 
need not be the sole result of the one cause. A cause is termed "sufficient" when it 
inevitably initiates or produces an effect. Any given cause may be necessary, suffi¬ 
cient, neither, or both. These possibilities are explained below. 

Four conditions under which independent variable X may cause Y

         X is          X is
       necessary    sufficient

1.         +             +
2.         +             -
3.         -             +
4.         -             -

1. X is necessary and sufficient to cause Y. Both X and Y are always present
together, and nothing but X is needed to cause Y; X→Y.

2. X is necessary but not sufficient to cause Y. X must be present when Y is
present, but Y is not always present when X is. Some additional factor(s) must also
be present; X and Z→Y.

3. X is not necessary but is sufficient to cause Y. Y is present when X is, but X
may or may not be present when Y is present, because Y has other causes and
can occur without X. For example, an enlarged spleen can have many separate
causes that are unconnected with each other; X→Y; Z→Y.

4. X is neither necessary nor sufficient to cause Y. Again, X may or may not be
present when Y is present. Under these conditions, however, if X is present
with Y, some additional factor must also be present. Here X is a contributory
cause of Y in some causal sequences; X and Z→Y; W and Z→Y.

These relationships and the logic of causal inference are discussed in Causal Inference.1

1 Rothman KJ (Ed): Causal Inference. Chestnut Hill, MA: Epidemiology Resources Inc., 1988.
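
The four conditions can be expressed as a simple check on paired observations of X
and Y. The sketch below is illustrative only: the function name causal_condition is
hypothetical, and it describes only the logical pattern in a set of observations.
As the entry notes, such a pattern is not by itself enough to establish causality.

```python
def causal_condition(observations):
    """Return 1-4, matching the four conditions listed above.

    `observations` is an iterable of (x_present, y_present) boolean pairs.
    """
    pairs = list(observations)
    # X looks necessary if Y is never seen without X.
    necessary = not any(y and not x for x, y in pairs)
    # X looks sufficient if X is never seen without Y.
    sufficient = not any(x and not y for x, y in pairs)
    if necessary and sufficient:
        return 1
    if necessary:
        return 2
    if sufficient:
        return 3
    return 4


# Y sometimes occurs without X (other causes), but X is always followed by Y.
enlarged_spleen_like = [(True, True), (False, True), (False, False)]
condition = causal_condition(enlarged_spleen_like)
```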

causation of disease, factors in The following factors have been differentiated (but
they are not mutually exclusive):

Predisposing factors are those that prepare, sensitize, condition, or otherwise create
a situation such as a level of immunity or state of susceptibility so that the host
tends to react in a specific fashion to a disease agent, personal interaction,
environmental stimulus, or specific incentive. Examples include age, sex, marital status,
family size, educational level, previous illness experience, presence of concurrent
illness, dependency, working environment, and attitudes toward the use of health 
services. These factors may be “necessary" but are rarely “sufficient" to cause the 
phenomenon under study. 

Enabling factors are those that facilitate the manifestation of disease, disability,
ill-health, or the use of services or conversely those that facilitate recovery from illness,
maintenance or enhancement of health status, or more appropriate use of health 
services. Examples include income, health insurance coverage, nutrition, climate, 
housing, personal support systems, and availability of medical care. These factors 
may be “necessary" but are rarely “sufficient" to cause the phenomenon under study. 

Precipitating factors are those associated with the definitive onset of a disease,
illness, accident, behavioral response, or course of action. Usually one factor is more
important or more obviously recognizable than others if several are involved and 
one may often be regarded as “necessary." Examples include exposure to specific 
disease, amount or level of an infectious organism, drug, noxious agent, physical 
trauma, personal interaction, occupational stimulus, or new awareness or knowl¬ 
edge. 

Reinforcing factors are those tending to perpetuate or aggravate the presence of a 
disease, disability, impairment, attitude, pattern of behavior, or course of action. 
They may tend to be repetitive, recurrent, or persistent and may or may not
necessarily be the same or similar to those categorized as predisposing, enabling, or
precipitating. Examples include repeated exposure to the same noxious stimulus (in 
the absence of an appropriate immune response) such as an infectious agent, work, 
household, or interpersonal environment, presence of financial incentive or disin¬ 
centive, personal satisfaction, or deprivation. 

CAUSES OF DEATH See DEATH CERTIFICATE.

cause-deleted life table A life table constructed using death rates lowered by
eliminating the risk of dying from a specified cause; its most common use is to calculate
the gain in life expectancy that would result from the elimination of one cause. 

cause-specific rate A rate that specifies events, such as deaths, according to their
cause.

censoring This term refers to the loss of subjects from a follow-up study; the occur¬ 
rence of the event of interest among such subjects is uncertain after a specified time 
when it was known that the event of interest had not occurred; it is not known, 
however, if or when the event of interest occurred subsequently. Such subjects are 
described as censored. For example, in a follow-up study with myocardial infarction 
as the outcome of interest, a subject who has not had an infarct but is killed in a 
traffic crash in year 6 is described as censored as of year 6, since it cannot be known 
when, if ever, he might have had an infarct at a later year of follow-up. This is 
censoring by competing risk; other varieties include loss to follow-up and termina¬ 
tion of the study. Examination of data for censoring requires the use of special 
analytic methods, such as life table analysis. 

census An enumeration of a population, originally intended for purposes of taxation 
and military service. Census enumeration of a population usually records identities 
of all persons in every place of residence, with age, or birth date, sex, occupation, 
national origin, language, marital status, income, and relationship to head of
household, in addition to information on the dwelling place. Many other items of
information may be included, e.g., educational level (or literacy), and health-related data
such as permanent disability. A de facto census allocates persons according to their 
location at the time of enumeration. A de jure census assigns persons according to 
their usual place of residence at the time of enumeration. 

census tract An area for which details of population structure are separately
tabulated at a periodic census; normally it is the smallest unit of analysis of (published)
census tabulations. Census tracts are chosen because they have well-defined bound¬ 
aries, sometimes the same as local political jurisdictions, sometimes defined by con¬ 
spicuous geographical features such as main roads, rivers. In urban areas census 
tracts may be further subdivided, e.g., into city blocks, but published tables do not
contain details to this level. 

CENTILE see QUANTILES. 

cessation experiment Controlled study in which an attempt is made to evaluate the 
termination of an exposure to risk such as a living habit that is considered to be of 
etiologic importance. 

chart The medical dossier of a patient. See also information system; medical re¬ 
cord. 

check digit A single digit, derived from a multidigit number such as a case identifi¬ 
cation number, that is used as a screening test for transcription errors. 
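
The entry does not name a particular scheme. As one widely used example, the sketch
below implements the Luhn (mod 10) check digit; the choice of Luhn, the function
names, and the sample identification number are assumptions made for illustration.

```python
def luhn_check_digit(number: str) -> int:
    """Check digit to append to a string of decimal digits (Luhn scheme)."""
    total = 0
    # Walk from the rightmost digit, doubling every second digit
    # starting with the rightmost one.
    for position, char in enumerate(reversed(number)):
        digit = int(char)
        if position % 2 == 0:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return (10 - total % 10) % 10


def is_valid(number_with_check_digit: str) -> bool:
    """Screen a transcribed number: does its last digit match the rest?"""
    body, check = number_with_check_digit[:-1], number_with_check_digit[-1]
    return luhn_check_digit(body) == int(check)


case_id = "7992739871"
full_id = case_id + str(luhn_check_digit(case_id))
```

A single mistyped digit changes the expected check digit, so is_valid returns False
and the record can be flagged for review. As a screening test it does not catch
every possible error.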

chemoprophylaxis The administration of a chemical, including antibiotics, to prevent 
the development of an infection or the progression of an infection to active mani¬ 
fest disease. 

chemotherapy The use of a chemical to treat a clinically recognizable disease or to 
limit its further progress. 

child death rate The number of deaths of children aged 1-4 years in a given year 
per 1000 children in this age group. This is a useful measure of the burden of 
preventable communicable diseases in the child population. 

chi-square (χ²) distribution A variable is said to have a chi-square distribution with
K degrees of freedom if it is distributed like the sum of the squares of K
independent random variables, each of which has a normal distribution with mean zero and
variance one.

chi-square (χ²) test Any statistical test based on comparison of a test statistic to a
chi-square distribution. The oldest and most common chi-square tests are for detecting
whether two or more population distributions differ from one another; these tests 
usually involve counts of data, and may involve comparison of samples from the 
distributions under study, or the comparison of a sample to a theoretically expected 
distribution. The Pearson chi-square test is probably the best known; another is the 
Mantel-Haenszel test. (Statisticians disagree about the terminal letter; a bare
majority of those who contributed to the discussion of this entry prefer “chi-square"
rather than “chi-squared." Either usage is acceptable.) 
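
As an illustration of a test based on counts, the sketch below computes the Pearson
chi-square statistic for a 2 x 2 table, comparing observed counts with those
expected if the two distributions did not differ. It assumes no continuity
correction and nonzero row and column totals; the function name and the example
counts are hypothetical. With one degree of freedom the statistic is the square of a
standard normal variable, so the p value can be obtained from math.erfc.

```python
from math import erfc, sqrt


def pearson_chi_square_2x2(a: int, b: int, c: int, d: int):
    """Return (statistic, p value) for the 2 x 2 table [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Exposed / unexposed counts among cases (first row) and controls (second row).
statistic, p_value = pearson_chi_square_2x2(10, 20, 20, 10)
```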

chrisoms This word, which appears in Bills of Mortality, means infants who die 
before formal baptism; therefore, the number recorded in Bills of Mortality can be 
used to estimate (albeit inaccurately) neonatal death rates in studies of historical 
demography and epidemiology. 

chronic 1. Referring to a health-related state, lasting a long time. 2. Referring to
exposure, prolonged or long-term, often with specific reference to low-intensity. 3.
The U.S. National Center for Health Statistics defines a “chronic" condition as one 
of three months' duration or longer. 

class A term used in the theory of frequency distributions. The total number of
observations made upon a particular variate may be grouped into classes according to
convenient divisions of the variate range in order to make subsequent analyses less 
laborious, or for other reasons. A group so determined is called a "class." The 
variate values that determine the upper and lower limits of a class are called "class
boundaries," the interval between them is the class interval, and the frequency
falling into the class is the class frequency.

classification (Syn: categorization) Assignment to predesignated classes on the basis
of perceived common characteristics. A means of giving order to a group of
disconnected facts. Ideally, a classification should be characterized by (1) naturalness: the
classes correspond to the nature of the thing being classified, (2) exhaustiveness:
every member of the group will fit into one (and only one) class in the system, (3)
usefulness: the classification is practical, (4) simplicity: the subclasses are not
excessive, and (5) constructability: the set of classes can be constructed by a
demonstrably systematic procedure.

classification of diseases Arrangement of diseases into groups having common
characteristics. Useful in efforts to achieve standardization, and therefore
comparability, in the methods of presentation of mortality and morbidity data from
different sources. May include a systematic numerical notation for each disease entry.
Examples include the INTERNATIONAL CLASSIFICATION OF DISEASES, INJURIES, AND
CAUSES OF DEATH (ICD) and the INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS
IN PRIMARY CARE (ICHPPC).

class, social A method of socially stratifying populations, e.g., according to education,
income, or occupation. See also socioeconomic classification.

clinical decision analysis Application of decision analysis in a clinical setting with
the aim of applying epidemiologic and other data on probability of outcomes when 
alternative decisions can be made, e.g., surgical intervention or drug treatment for 
myocardial ischemia. 

clinical epidemiologist A practitioner of clinical epidemiology. 

clinical epidemiology While some epidemiologists deplore any adjectival qualification
of the discipline, a subspecialty of clinical epidemiology is sufficiently demarcated
to justify definition. There are plenty of suggested definitions. John R. Paul1
proposed “A marriage between quantitative concepts used by epidemiologists to
study disease in populations and decision-making in the individual case which is the
daily fare of clinical medicine.” Patient care is central to Sackett's definition2: “The
application, by a physician who provides direct patient care, of epidemiologic and
biometric methods to the study of diagnostic and therapeutic processes in order to
effect an improvement in health.” While limiting the discipline to medical graduates
in clinical practice, this definition is conceptually close to the definition of clinical
decision analysis; the proper distinction between clinical epidemiology and clinical
decision analysis may be that the epidemiologist works with a defined population,
even if it is a population of patients rather than a community-based population with
numerator and denominator in the conventional epidemiologic sense; clinical
decision analysis can be applied to a single patient. Abramson's definition3 is “The use
of epidemiological principles, methods and findings in personal health care or
community-oriented primary care, with special reference to applications in
diagnostic and prognostic appraisal, decisions concerning care and the evaluation of
care. The term sometimes refers to any epidemiological study conducted in a
clinical setting.” Weiss4 defines clinical epidemiology as “The study of variation in the
outcome of illness and of the reasons for that variation.” The existence of the above
and other subtly different definitions suggests that this branch of epidemiology
remains inchoate.

1 J Clin Invest 17:539-541, 1938.

2 Am J Epidemiol 89:125-128, 1969.

3 Personal communication, 1986.

4 Clinical Epidemiology. New York: Oxford University Press, 1986.

CLINICAL TRIAL (Syn: therapeutic trial) A research activity that involves the
administration of a test regimen to humans to evaluate its efficacy and safety. The term is
subject to wide variation in usage, from the first use in humans without any control
treatment to a rigorously designed and executed experiment involving test and con¬ 
trol treatments and randomization. 

See also community trial. 

clinimetrics Feinstein,1 who coined this term, defines it as the domain concerned with
indexes, rating scales, and other expressions that are used to describe or measure 
symptoms, physical signs, and other distinctly clinical phenomena in clinical medi¬ 
cine. Such measurements, of course, are an essential part of many epidemiologic 
studies. 

1 Feinstein AR: Clinimetrics. New Haven and London: Yale University Press, 1987.

closed cohort A population in which membership begins at a defined time or with a 
defined event and ends only through occurrence of the study outcome or the end 
of eligibility for membership. An example is a population of women in labor being 
studied to determine the vital status of their offspring (i.e., whether live or still¬ 
born). 

cluster analysis A set of statistical methods used to group variables or observations 
into strongly interrelated subgroups. 

clustering (Syn: disease cluster, time cluster, time-place cluster) A closely grouped
series of events or cases of a disease or other health-related phenomena with well-defined
distribution patterns, in relation to time or place or both. The term is normally
used to describe aggregation of relatively uncommon events or diseases, e.g.,
leukemia, multiple sclerosis.

cluster sampling A sampling method in which each unit selected is a group of per¬ 
sons (all persons in a city block, a family, etc.) rather than an individual. 

coding Translation of information, e.g., questionnaire responses, into numbered cate¬ 
gories for entry in a data processing system. 

coefficient of variation The ratio of the standard deviation to the mean. This 
is meaningful only if the variable is measured on a ratio scale. See measurement 
scale. 
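
The ratio described in this entry can be sketched in a few lines. This is a minimal illustration only: the function name and the sample weights are hypothetical, and the sample (n - 1) standard deviation is assumed because the entry does not say which form is meant.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined when the mean is zero")
    return stdev(values) / m


# Hypothetical body weights in kilograms (a ratio-scale variable).
weights_kg = [60.0, 70.0, 80.0]
cv = coefficient_of_variation(weights_kg)
print(f"CV = {cv:.3f}")
```

Because the result is a pure ratio, it is often quoted as a percentage, which only makes sense when zero on the scale means "none of the quantity."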

cohort (from Latin cohors, warriors, the tenth part of a legion)

1. The component of the population born during a particular period and identified
by period of birth so that its characteristics (e.g., causes of death and
numbers still living) can be ascertained as it enters successive time and age
periods.

2. The term “cohort” has broadened to describe any designated group of persons
who are followed or traced over a period of time, as in cohort study
(prospective study).

cohort analysis The tabulation and analysis of morbidity or mortality rates in relationship
to the ages of a specific group of people (cohort), identified at a particular
period of time and followed as they pass through different ages during part or all
of their life span. In certain circumstances, e.g., studies of migrant populations,
cohort analysis may be performed according to duration of residence of migrants
in a country rather than year of birth, in order to relate health or mortality experience
to duration of exposure.

cohort component method A method of population projection that takes the popu¬ 
lation distributed by age and sex at a base date and carries it forward in time on 
the basis of separate allowances for fertility, mortality, and migration. 

cohort effect See generation effect.

COHORT INCIDENCE See INCIDENCE. 

cohort slopes Arrangement of data so that when plotted graphically, lines connect
points representing the age-specific rates for population segments from the same
generation of birth (see diagram). These slopes represent changes in rates with age
during the life experience of each cohort.

[Figure: Cohort slopes (tuberculosis mortality rates of successive birth generations).
Cohort curves for years of birth, 1660-1950; horizontal axis: age. The line associated
with each year indicates death rates by age group for persons born in that year.
Death rates for tuberculosis, by age, United States, 1900-1960 (per 100,000 population).
From Susser, Watson, Hopper, 1985.]

Source: https://www.industrydocuments.ucsf.edu/docs/lnbjOOOO

cohort study (Syn: concurrent, follow-up, incidence, longitudinal, prospective study) 
The method of epidemiologic study in which subsets of a defined population can 
be identified who are, have been, or in the future may be exposed or not exposed, 
or exposed in different degrees, to a factor or factors hypothesized to influence the 
probability of occurrence of a given disease or other outcome. The alternative terms 
for a cohort study, i.e., follow-up, longitudinal, and prospective study, describe an 
essential feature of the method, which is observation of the population for a sufficient
number of person-years to generate reliable incidence or mortality rates in
the population subsets. This generally implies study of a large population, study 
for a prolonged period (years), or both. 

cointervention In a randomized controlled trial, the application of additional diagnostic
or therapeutic procedures to members of either or both the experimental
and the control groups. 

cold chain A system of protection against high environmental temperatures for heat- 
labile vaccines, sera, and other active biological preparations. Unless the cold chain 
is preserved, such preparations are inactivated and immunization procedures, etc.,
will be ineffective. Preservation of the cold chain is an integral part of the WHO
expanded program on immunization in tropical countries. 

Collinearity Very high correlation between variables. 

COLONIZATION See INFECTION. 

commensal Literally, eating together (sharing the same table); an organism that lives 
harmlessly in the gut. See also xenobiotic. 

common source epidemic (Syn: common vehicle epidemic) See epidemic, common
source.

common vehicle spread Spread of disease agent from a source that is common
to those who acquire the disease, e.g., water, milk, shellfish, foods, air, or syringe
contaminated by infectious or noxious agents. See also transmission of infection.

communicable disease (Syn: infectious disease) An illness due to a specific infectious 
agent or its toxic products that arises through transmission of that agent or its 
products from an infected person, animal, or reservoir to a susceptible host, either 
directly or indirectly through an intermediate plant or animal host, vector, or the 
inanimate environment. See also transmission of infection. 

communicable period The time during which an infectious agent may be transferred
directly or indirectly from an infected person to another person, from an infected 
animal to man, or from an infected person to an animal, including arthropods. See 
also TRANSMISSION OF INFECTION. 

community A group of individuals organized into a unit, or manifesting some unifying 
trait or common interest; loosely, the locality or catchment area population for which 
a service is provided, or more broadly, the state, nation, or body politic. 

community diagnosis The process of appraising the health status of a community,
including assembly of vital statistics and other health-related statistics and of infor¬ 
mation pertaining to determinants of health, such as prevalence of tobacco smok¬ 
ing, and examination of the relationships of these determinants to health in the 
specified community. The term may also denote the findings of this diagnostic pro¬ 
cess. Community diagnosis may attempt to be comprehensive, or may be restricted 
to specific health conditions, determinants, or subgroups. J.N. Morris 1 identified 
community diagnosis as one of the uses of epidemiology. 

1 Br Med J 2:395-401, 1955.

COMMUNITY HEALTH See PUBLIC HEALTH. 

community medicine Since the late 1960s, this term has gained wide currency as the 
preferred name for important activities concerning health care in the community. 
There are several different definitions, including the following. 

1. The field concerned with the study of health and disease in the population of 
a defined community or group. Its goal is to identify the health problems and
needs of defined populations, to identify means by which these needs should 
be met, and to evaluate the extent to which health services effectively meet 
these needs. 

2. The practice of medicine concerned with groups or populations rather than 
with individual patients. This includes the elements listed in definition 1, together
with the organization and provision of health care at a community or
group level. 

3. The term is also used to describe the practice of medicine in the community, 
e.g., by a family physician. Some writers equate the terms "family medicine" 
and "community medicine"; others confine its use to public health practice. 

4. Community-oriented primary health care is an integration of community 
medicine with the primary health care of individuals in the community. In 
this form of practice the community practitioner or community health team 
has responsibility for health care both at a community and at an individual 
level. 

See also public health; social medicine. 

community trial Experiment in which the unit of allocation to receive a preventive or 
therapeutic regimen is an entire community or political subdivision. Examples in¬ 
clude the trials of fluoridation of drinking water, and of heart disease prevention 
in North Karelia (Finland) and California. See also clinical trial. 

comorbidity Disease(s) that coexist(s) in a study participant in addition to the index
condition that is the subject of study. 

comparison group Any group to which the index group is compared. Usually synonymous
with control group.

competing cause When a previously common cause of death becomes rare, other causes
become more prominent. These other causes are referred to as competing causes. 
For instance, among young adults, pneumonia and other infections were a common 
cause of death until about midway through the 20th century; their control has 
brought to prominence some competing causes of death, notably malignant disease 
and suicide. 

competing risk An event that removes a subject from being at risk for the outcome 
under investigation. For example, in a study of smoking and cancer of the lung, a 
subject who dies of coronary heart disease is no longer at risk of lung cancer, and
in this situation, coronary heart disease is a competing risk.

completed fertility rate The number of children born alive per woman in a cohort
of women by the end of their child-bearing years.

completing the clinical picture The use of epidemiology to define all modes of
presentation of a disease, and/or all possible outcomes. One of the "uses of epidemiology"
identified by J.N. Morris. 1

1 Br Med J 2:395-401, 1955.

completion rate The proportion or percentage of persons in a survey for whom 
complete data are available for analysis. See also response rate.

composite index An index, such as the Apgar score or the Tumor/Nodes/Metastases (TNM)
stage of cancer, that contains contributions from categories of several different variables.

computer A programmable electronic device that can be used to store and manipulate 
data in order to carry out designated functions. The two fundamental components
of a computer are hardware, i.e., the actual electronic device, and software, the
instructions or program used to carry out the function. Computer science has created
a large language of its own, describing types of computers (main-frame, micro,
digital, analogue, etc.) and all aspects of the process. Most of the terms used in this
field are defined by AJ Meadows, M Gordon, and A Singleton. 1

1 Dictionary of New Information Technology. London: Century, 1982.

concordance Pairs or groups of individuals of identical phenotype. In twin studies, a 
condition in which both twins exhibit or fail to exhibit a trait under investigation. 
concordant A term used in twin studies to describe a twin pair in which both twins 
exhibit a certain trait. 

CONCURRENT STUDY See COHORT STUDY. 

conditional probability The probability of an event, given that another event has
occurred. If D and E are two events and P(. . .) is "the probability of (. . .)," the
conditional probability of D, given that E occurs, is denoted P(D|E), where the vertical
slash is read "given," and is equal to P(D and E)/P(E). The event E is the "conditioning
event." Conditional probabilities obey all the axioms of probability theory.
See also Bayes' theorem; probability theory.
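
As a worked illustration of P(D|E) = P(D and E)/P(E), the sketch below uses made-up counts for a population of 1,000 people; the variable names and numbers are assumptions chosen only to show the arithmetic.

```python
from fractions import Fraction

# Hypothetical counts (illustrative only, not from any study).
n_total = 1000     # everyone in the population
n_E = 200          # people for whom event E occurred (e.g., exposed)
n_D_and_E = 30     # people for whom both D and E occurred

p_E = Fraction(n_E, n_total)
p_D_and_E = Fraction(n_D_and_E, n_total)

# Conditional probability of D given E.
p_D_given_E = p_D_and_E / p_E
print(p_D_given_E)  # 3/20
```

Exact fractions are used so that the division shows clearly that the total population size cancels: only the counts within E matter.
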
confidence interval A range of values for a variable of interest, e.g., a rate, con¬ 
structed so that this range has a specified probability of including the true value of 
the variable. The specified probability is called the confidence level, and the end 
points of the confidence interval are called the confidence limits.
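
One common way to construct such an interval for a proportion is the normal-approximation (Wald) method. The entry does not prescribe a method, so the sketch below is an assumption for illustration; the function name, the counts, and the 95% confidence level (z = 1.96) are all hypothetical choices.

```python
import math


def wald_interval(events, n, z=1.96):
    """Normal-approximation confidence limits for a proportion events/n.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    p = events / n
    se = math.sqrt(p * (1 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


# Hypothetical survey: 40 of 200 respondents have the characteristic.
lower, upper = wald_interval(events=40, n=200)
print(f"proportion = 0.200, 95% confidence limits = ({lower:.3f}, {upper:.3f})")
```

The approximation is known to behave poorly for small samples or proportions near 0 or 1, where other interval methods are usually preferred.
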

confounding [from the Latin confundere, to mix together]

1. A situation in which the effects of two processes are not separated. The distortion
of the apparent effect of an exposure on risk brought about by the
association with other factors that can influence the outcome.

2. A relationship between the effects of two or more causal factors as observed 
in a set of data, such that it is not logically possible to separate the contribution 
that any single causal factor has made to an effect. 

3. A situation in which a measure of the effect of an exposure on risk is distorted
because of the association of exposure with other factor(s) that influence the
outcome under study. 

confounding variable (Syn: confounder) A variable that can cause or prevent the
outcome of interest, is not an intermediate variable, and is associated with the
factor under investigation. Such a variable must be controlled in order to obtain an
undistorted estimate of the effect of the study factor on risk. 

consanguine Related by a common ancestor within the previous few generations. 

CONSISTENCY 

1. Close conformity between the findings in different samples, strata, or popu¬ 
lations, or at different times or in different circumstances, or in studies con¬ 
ducted by different methods or different investigators. Consistency may be 
examined in order to study effect modification. Consistency of results on rep¬ 
lication of studies is an important criterion in judgments of causality. 

2. In statistics, an estimator is said to be consistent if the probability of it yielding 
estimates close to the true value approaches one as the sample size grows larger.

contact (of an infection) A person or animal that has been in such association with 
an infected person or animal or a contaminated environment as to have had op¬ 
portunity to acquire the infection. 

contact, direct A mode of transmission of infection between an infected host and 
susceptible host. Direct contact occurs when skin or mucous surfaces touch, as in 
shaking hands, kissing, and sexual intercourse. See also contagion; transmission
of infection.

contact, indirect A mode of transmission of infection involving fomites or vectors. 
Vectors may be mechanical (e.g., filth flies) or biological (the disease agent undergoes
part of its life cycle in the vector species). See also transmission of infection.

contact, primary Person(s) in direct contact or associated with a communicable disease
case.

contact, secondary Person(s) in contact or associated with a primary contact.

contagion The transmission of infection by direct contact, droplet spread, or contaminated
fomites. These are the modes of transmission specified by Fracastorius in
De Contagione (1546); contemporary usage is sometimes looser, but use of this term
is best restricted to description of infection transmitted by direct contact. 

contagious Transmitted by contact; in common usage, "highly infectious." 

containment The concept of regional eradication of communicable disease, first proposed
by Soper in 1949 for the elimination of smallpox. 1 Containment of a worldwide
communicable disease demands a globally coordinated effort so that countries
that have effected an interruption of transmission do not become reinfected following
importation from neighboring endemic areas.

1 Pan American Health Organization, OSP, CE7, W-15, Washington DC, 1949.

CONTAMINATION 

1. The presence of an infectious agent on a body surface; also on or in clothes,
bedding, toys, surgical instruments or dressings, or other inanimate articles or
substances including water, milk, and food. Pollution is distinct from contamination
and implies the presence of offensive, but not necessarily infectious,
matter in the environment. Contamination of a body surface does not imply a
carrier state. See also transmission of infection.

2. The situation that exists when a population being studied for one condition 
or factor also possesses other conditions or factors that modify results of the 
study. In a randomized controlled trial, the inadvertent application of the 
experimental procedure to members of the control group, or inadvertent fail¬ 
ure to apply the procedure to members of the experimental group. 
contingency table A tabular cross-classification of data such that subcategories of one 
characteristic are indicated horizontally (in rows) and subcategories of another 
characteristic are indicated vertically (in columns). Tests of association between the 
characteristics in the columns and rows can be readily applied. The simplest contingency
table is the fourfold, or 2 x 2 table. Contingency tables may be extended to
include several dimensions of classification. 
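
As one example of a test of association on a fourfold table, the sketch below computes Pearson's chi-square statistic (without continuity correction) from the four cell counts. The entry does not name a specific test, so this choice, the function name, and the counts are assumptions for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom) for the table

                  column 1   column 2
        row 1        a          b
        row 2        c          d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be greater than zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical table: rows = exposed / unexposed, columns = diseased / not diseased.
chi2 = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square = {chi2:.2f}")  # 12.50
```

The statistic is compared with the chi-square distribution on one degree of freedom; with small expected cell counts an exact test is generally preferred.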
contingent variable See intermediate variable. 

continuing source epidemic (outbreak) An epidemic in which new cases of disease 
occur over a long period, indicating persistence of the disease source. 
continuous data, continuous variable Data (variable) with a potentially infinite 
number of possible values along a continuum. Data representing a continuous vari¬ 
able include height, weight, and enzyme output. 

CONTROL 

1. (v.) To regulate, restrain, correct, restore to normal. 

2. (n. or adj.) Applied to many communicable and some noncommunicable con¬ 
ditions, "control" means ongoing operations or programs aimed at reducing 
the incidence and/or prevalence, or eliminating such conditions. 

3. (n.) As used in the expressions case-control study and randomized control(led)
trial, "control" means person(s) in a comparison group that differs, respectively,
in disease experience or allocation to a regimen, from the subjects of
the study. 

4. (v.) In statistics, "control" means to adjust for or take into account extraneous
influences or observations. 

5. (adj.) In the expression "control variable" we refer to an independent variable
other than the hypothetical causal variable that has a potential effect on the 
dependent variable and is subject to control by analysis. 

The use of the noun "control" to describe the comparison groups in a case-control
study and in a randomized control(led) trial can confuse the uninitiated, e.g.,
ethical review committees; the essential ethical distinction is that there may be no 
intervention in the lives or health status of the controls in a case-control study, 
whereas controls in a randomized controlled trial may be asked to undergo a pro¬ 
cedure or regimen that may affect their health; their informed consent is therefore 
essential. Consent may not be required (save to gain access to medical records) to 
study controls in a case-control study. As M.W. Susser 1 has pointed out, the use of
the word "control" as verb, adjective, and noun may confuse even careful readers. 
The verb is best used in the sense of controlling sources of extraneous variation in 
the dependent variable, whether by design or analysis. The verb is also used in the
sense of controlling disease or its causes. The adjective is best used to describe 
control variables in contradistinction to uncontrolled and confounding variables. 
The adjective also can be used to describe a control group assembled for compari¬ 
son with a group of cases or with an experimental group. The noun is best used to 
designate the members of a control group. 

1 Causal Thinking in the Health Sciences. New York: Oxford, 1973.

controls, historical Persons or patients used for comparison who had the condition 
or treatment under study at a different time, generally at an earlier period than
the study group or cases. Historical controls are often unsatisfactory because other 
factors affecting the condition under study may have changed to an unknown ex¬ 
tent in the time elapsed. 

controls, hospital Persons used for comparison who are drawn from the population
of patients in a hospital. Hospital controls are often a source of selection bias.

controls, matched Controls who are selected so that they are similar to the study
group, or cases, in specific characteristics. Some commonly used matching variables
are age, sex, race, and socioeconomic status. See also matching.

controls, neighborhood Persons used for comparison who live in the same locality 
as cases and therefore may resemble cases in environmental and socioeconomic 
criteria. 

controls, sibling Persons used for comparison who are the siblings of cases and 
therefore share genetic makeup. 

coordinates In a two-dimensional graph, the values of ordinate and abscissa that de¬ 
fine the locus or position of a point. 

cordon sanitaire The barrier erected around a focus of infection. Used mainly in the 
isolation procedures applied to exclude cases and contacts of life-threatening communicable
diseases from society. Mainly of historical interest.

correlation The degree to which variables change together. 

correlation coefficient A measure of association that indicates the degree to which
two variables have a linear relationship. This coefficient, represented by the letter
r, can vary between +1 and -1; when r = +1, there is a perfect positive linear
relationship in which one variable varies directly with the other; when r = -1,
there is a perfect negative linear relationship between the variables. The measure
can be generalized to quantify the degree of linear relationship between one variable
and several others, in which case it is known as the multiple correlation coefficient.
Kendall's Tau, Spearman's Rank Correlation, and Pearson's Product Moment
Correlation tests are special varieties with occasional applications in
epidemiology. M.G. Kendall and W.R. Buckland's Dictionary of Statistical Terms 1 gives
details.

1 London: Longman, 1983.
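
The behavior of r at its two extremes can be checked with a short sketch of Pearson's product moment coefficient. The function name and the data are hypothetical and serve only to illustrate the definition.

```python
import math


def pearson_r(x, y):
    """Pearson product moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when either variable has zero variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y doubles x exactly, so the linear relationship is perfect.
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# Hypothetical data: y falls by one unit for each unit rise in x.
r_negative = pearson_r([1, 2, 3], [3, 2, 1])
print(round(r_positive, 6), round(r_negative, 6))
```

Note that r measures only linear association: two variables can be strongly related in a curved pattern and still give a value of r near zero.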

correlation, nonsense A meaningless correlation between two variables. Nonsense 
correlations sometimes occur when social, economic, or technological changes have 
the same trend over time as incidence or mortality rates. An example is correlation 
between the birth rate and the density of storks in parts of Holland and Germany.

See also confounding; ecological fallacy. 

cost-benefit analysis An economic analysis in which the costs of medical care and
the loss of net earnings due to death or disability are considered. The general rule
for the allocation of funds in a cost-benefit analysis is that the ratio of marginal
benefit (the benefit of preventing an additional case) to marginal cost (the cost of
preventing an additional case) should be equal to or greater than 1.

cost-effectiveness analysis This form of analysis seeks to determine the costs and 
effectiveness of an activity, or to compare similar alternative activities to determine 
the relative degree to which they will obtain the desired objectives or outcomes. 
The preferred action or alternative is one that requires the least cost to produce a 
given level of effectiveness, or provides the greatest effectiveness for a given level 
of cost. In the health care field, outcomes are measured in terms of health status. 

cost-utility analysis An economic analysis in which outcomes are measured in terms
of their social value.

covariate A variable that is possibly predictive of the outcome under study. A covariate
may be of direct interest to the study or may be a confounding variable or
effect modifier. 

coverage A measure of the extent to which the services rendered cover the potential 
need for these services in a community. It is expressed as a proportion in which 
the numerator is the number of services rendered, and the denominator is the 
number of instances in which the service should have been rendered. Example: 

                                 Number of deliveries attended by a
    Annual obstetric coverage    qualified midwife or obstetrician
    in a community            =  ------------------------------------
                                 Expected number of deliveries during
                                 the year in a given community
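
The proportion above can be computed directly. In this minimal sketch the function name and the figures are hypothetical; only the numerator and denominator definitions come from the entry.

```python
def coverage(services_rendered, services_needed):
    """Proportion of the potential need for a service that was actually met."""
    if services_needed <= 0:
        raise ValueError("the denominator (expected need) must be positive")
    return services_rendered / services_needed


# Hypothetical community: 450 attended deliveries out of 600 expected in the year.
obstetric_coverage = coverage(450, 600)
print(f"Annual obstetric coverage = {obstetric_coverage:.0%}")  # 75%
```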

Cox model See proportional hazards model. 

criterion A principle or standard by which something is judged. See also standard. 

Cronbach’s alpha (Syn: internal consistency reliability) An estimate of the correlation 
between the total score across a series of items from a rating scale and the total 
score that would have been obtained had a comparable series of items been em¬ 
ployed. 
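
The entry gives no formula. The sketch below uses the formula commonly cited for Cronbach's alpha, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), which is an assumption added here and not part of the definition above. The function name and the scores are hypothetical.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: a list of k lists, one per item, each holding that item's
    scores for the same respondents in the same order."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    sum_item_variances = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))


# Hypothetical rating scale: 3 items scored by 5 respondents.
scores = [
    [2, 3, 4, 5, 3],
    [3, 3, 5, 5, 2],
    [2, 4, 4, 4, 3],
]
alpha = cronbach_alpha(scores)
print(f"alpha = {alpha:.2f}")
```

When every item gives identical scores the formula returns 1; values fall as the items agree less with one another.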

cross-cultural study A study in which populations from different cultural backgrounds
are compared.

crossover design A method of comparing two or more treatments or interventions in 
which the subjects or patients, upon completion of the course of one treatment, are 
switched to another. In the case of two treatments, A and B, half the subjects are
randomly allocated to receive these in the order A, B and half to receive them in 
the order B, A. A criticism of this design is that effects of the first treatment may 
carry over into the period when the second is given. 

CROSS-PRODUCT RATIO See ODDS RATIO. 

cross-sectional study (Syn: disease frequency survey, prevalence study) A study that
examines the relationship between diseases (or other health-related characteristics)
and other variables of interest as they exist in a defined population at one particular
time. The presence or absence of disease and the presence or absence of the other
variables (or, if they are quantitative, their level) are determined in each member
of the study population or in a representative sample at one particular time. The
relationship between a variable and the disease can be examined (1) in terms of the
prevalence of disease in different population subgroups defined according to the 
presence or absence (or level) of the variables and (2) in terms of the presence or 
absence (or level) of the variables in the diseased versus the nondiseased. Note that 
disease prevalence rather than incidence is normally recorded in a cross-sectional 
study. The temporal sequence of cause and effect cannot necessarily be determined 
in a cross-sectional study. See also morbidity survey. 

CRUDE DEATH RATE See DEATH RATE. 

cumulative death rate The proportion of a group that dies over a specified time
interval. It may refer to all deaths or to deaths from specific cause(s). If follow-up 
is not complete on all persons the proper estimation of this rate requires the use of 
methods that take account of censoring. Distinct from force of mortality. 

cumulative incidence, cumulative incidence rate The number or proportion of a
group of people who experience the onset of a health-related event during a specified
time interval; this interval is generally the same for all members of the group,
but, as in lifetime incidence, it may vary from person to person without reference
to age.

cumulative incidence ratio The ratio of the cumulative incidence rate in the exposed
to the cumulative incidence rate in the unexposed.
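
A minimal numerical sketch of this ratio follows. The function name and the group sizes and case counts are hypothetical, and complete follow-up of both groups over the same interval is assumed.

```python
def cumulative_incidence(new_cases, persons_at_start):
    """Proportion of a group that develops the event during the interval."""
    if persons_at_start <= 0:
        raise ValueError("the group must contain at least one person")
    return new_cases / persons_at_start


# Hypothetical follow-up of two groups over the same time interval.
ci_exposed = cumulative_incidence(new_cases=30, persons_at_start=1000)
ci_unexposed = cumulative_incidence(new_cases=10, persons_at_start=1000)

cumulative_incidence_ratio = ci_exposed / ci_unexposed
print(f"cumulative incidence ratio = {cumulative_incidence_ratio:.1f}")
```

A ratio above 1 indicates a higher cumulative incidence in the exposed group; a ratio of 1 indicates no difference.
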

cusum Acronym for cumulative sum (of a series of measurements). This is a useful way
to demonstrate a change in trend or direction of a series of measurements. 1 Calculation
begins with a reference figure, e.g., the expected average measurement. As
each new measurement is observed, the reference figure is subtracted, and a cumulative
total is produced by adding each successive difference. This cumulative
total is the cusum.

1 Alderson M: An Introduction to Epidemiology, 2nd ed. London: Macmillan, 1983.
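
The calculation described in the cusum entry can be sketched as a running total of differences from the reference figure. The function name and the weekly counts are hypothetical.

```python
from itertools import accumulate


def cusum(measurements, reference):
    """Running total of (measurement - reference) for each successive measurement."""
    return list(accumulate(m - reference for m in measurements))


# Hypothetical weekly case counts, with an expected average of 10 cases per week.
weekly_cases = [10, 12, 9, 13, 14]
print(cusum(weekly_cases, reference=10))  # [0, 2, 1, 4, 8]
```

A cusum that hovers around zero suggests the series is tracking the reference figure; a sustained climb or fall signals a change in level.
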
cyclicity, seasonal The annual cycling of incidence on a seasonal basis. Certain acute 
infectious diseases, if of greater than rare occurrence, peak in one season of the
year and reach the low point six months later (or in the opposite season). The onset
of some symptoms of some chronic diseases also may show this amplitudinal cyclicity.
Demographic phenomena such as marriage and births, and mortality from
all causes and certain specific causes, may also exhibit seasonal cyclicity.
cyclicity, secular Long-term (greater than one year) cycling of disease incidence. For 
example, measles in a large, unimmunized population has a high incidence every 
second year; hepatitis A has a higher incidence every seventh year. Such cycling is 
the result of continuous exhaustion and replacement of susceptibles in a relatively
stable population. Secular cyclicity may have large interval swings as in the recur¬ 
rence of pandemics of influenza. 

CYST COUNT See WORM COUNT.

data dredging A jargon term, meaning analyses done on a post hoc basis without
benefit of presumed hypotheses, as a means of identifying noteworthy differences.
Such analyses are sometimes done when data have been collected on a large num¬ 
ber of variables and hypotheses are suggested by the data; the scientific validity of 
data dredging is at best dubious, usually unacceptable. 
data processing Conversion (as by computer) of crude information into usable or 
storable form. Data generated by epidemiologic studies are usually transferred to
punch cards or optical mark-sense forms and thence to a computer for storage and 
retrieval. The term is often loosely used to mean also the statistical analysis of data 
by a computer program. See also punch card.
death certificate A vital record signed by a licensed physician or, in some nations, 
by another designated health worker, that includes cause of death, decedent's name, 
sex, birthdate, and place of residence and of death. Occupation, birthplace, and 
other information may be included. Immediate cause of death is recorded on the 
first line, followed by conditions giving rise to the immediate cause; the underlying
cause is entered last. The underlying cause is coded and tabulated in official publications
of cause-specific mortality. Other significant conditions may also be recorded
separately, as is the mode of death, whether accidental or violent, etc.

[Figure: International Standard Death Certificate.]

CAUSE OF DEATH

I   Disease or condition directly leading to death:
        (a) . . . . . due to (or as a consequence of)
    Antecedent causes (morbid conditions, if any, giving rise to the above cause,
    stating the underlying condition last):
        (b) . . . . . due to (or as a consequence of)
        (c) . . . . .

II  Other significant conditions contributing to the death, but not related to the
    disease or condition causing it

The most important entries on a death certificate are underlying causes of death and
cause of death. These are defined in the Ninth (1975) Revision of the International
Classification of Diseases, as follows:

Causes of death: The causes of death to be entered on the medical certificate of
cause of death are all those diseases, morbid conditions, or injuries that either resulted
in or contributed to death and the circumstances of the accident or violence
which produced any such injuries. 

Underlying cause of death: The underlying cause of death is (1) the disease or injury
that initiated the train of events leading to death, or (2) the circumstances of the 
accident or violence that produced the fatal injury. 

Personal identifying information such as birthplace, parents' names (last name at 
birth), and birthdates are included on death certificates in some jurisdictions; this
extra information makes possible a range of record linkage studies. 
death rate An estimate of the proportion of a population that dies during a specified
period. The numerator is the number of persons dying during the period; the
denominator is the size of the population, usually estimated as the mid-year
population. The death rate in a population is generally calculated by the formula

\[
\text{death rate} = \frac{\text{Number of deaths during a specified period}}{\text{Number of persons at risk of dying during the period}} \times 10^{n}
\]

This rate is an estimate of the person-time death rate, i.e., the death rate per 10^n
person-years. If the rate is low, it is also a good estimate of the cumulative death
rate. This rate is also called the crude death rate.
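
As an illustration of the formula above, the following minimal sketch computes a crude death rate. The function name, the default multiplier of 1,000, and the figures are assumptions chosen for the example; they are not taken from the dictionary.

```python
def crude_death_rate(deaths, midyear_population, per=1000):
    """Deaths in the period divided by the mid-year population, scaled by `per` (10^n)."""
    if midyear_population <= 0:
        raise ValueError("midyear_population must be positive")
    return deaths / midyear_population * per


# Hypothetical figures, for illustration only.
rate = crude_death_rate(deaths=850, midyear_population=100_000, per=1000)
print(f"Crude death rate: {rate:.1f} per 1,000 population")
```

The multiplier is left as a parameter because the entry states only that the rate is expressed per 10^n.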
death registration area A geographic area for which mortality data are published. 
decision analysis A derivative of operations research and game theory that involves
identifying all available choices and potential outcomes of each, in a series of
decisions that have to be made about aspects of patient care: diagnostic procedures,
therapeutic regimens, prognostic expectations. Epidemiologic data play a large part
in determining the probabilities of outcomes following each choice that has to be
made. The range of choices can be plotted on a decision tree, and at each branch,
or decision node, the probabilities of each outcome that can be predicted are
displayed. The decision tree thus portrays the choices available to those responsible
for patient care and the probabilities of each outcome that will follow the choice of
a particular action or strategy in patient care. The relative worth of each outcome
is preferably also described as a utility or quality of life, e.g., a probability of life
expectancy or of freedom from disability. 1
1 Pauker SG, Kassirer JP: Decision analysis. N Engl J Med 316:250-258, 1987.
decision tree The alternative choices expressed in quantitative terms, available at each
stage in the process of thinking through a problem, may be likened to branches,
and the hierarchical sequence of options, to a tree. Hence, decision tree. It is a
graphic device used in decision analysis, in which a series of decision options are
represented as branches and subsequent possible outcomes are represented as
further branches. The decisions and the eventualities are presented in the order they
are likely to occur. The junction where a decision must be taken is called a decision
node.
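
One common way to work with such a tree is to "roll it back": average the utilities at each chance branch and pick the best option at each decision node. The sketch below assumes a simple dictionary representation and made-up probabilities and utilities; none of these names or numbers come from the dictionary.

```python
def expected_utility(node):
    """Roll back a decision tree.

    A leaf is {"utility": u}; a chance node is {"chance": [(p, child), ...]};
    a decision node is {"decision": {option_name: child, ...}}.
    """
    if "utility" in node:
        return node["utility"]
    if "chance" in node:
        return sum(p * expected_utility(child) for p, child in node["chance"])
    return max(expected_utility(child) for child in node["decision"].values())


def best_choice(decision_node):
    """Name of the option with the highest expected utility at a decision node."""
    options = decision_node["decision"]
    return max(options, key=lambda name: expected_utility(options[name]))


# Hypothetical tree: probabilities and utilities are invented for illustration.
tree = {
    "decision": {
        "treat": {"chance": [(0.8, {"utility": 0.9}), (0.2, {"utility": 0.2})]},
        "do_not_treat": {"chance": [(0.6, {"utility": 1.0}), (0.4, {"utility": 0.3})]},
    }
}
print(best_choice(tree), expected_utility(tree))
```

Maximizing expected utility is only one possible decision rule; it is used here because it is the simplest to show.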

deduction Reasoned argument proceeding from the general to the particular.

degrees of freedom (df) The number of independent comparisons that can be made
between the members of a sample. This important concept in statistical testing
cannot be defined briefly. It refers to the number of independent contributions to a
sampling distribution (such as the χ², t, and F distributions). In a contingency table it
is one less than the number of row categories multiplied by one less than the num¬ 
ber of column categories. 
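
The contingency-table rule stated above can be written as a one-line function. This is a minimal sketch; the function name is hypothetical.

```python
def contingency_table_df(n_rows, n_cols):
    """Degrees of freedom of an r x c contingency table: (r - 1) * (c - 1)."""
    if n_rows < 1 or n_cols < 1:
        raise ValueError("a table needs at least one row and one column")
    return (n_rows - 1) * (n_cols - 1)


print(contingency_table_df(2, 2))  # fourfold table
print(contingency_table_df(3, 4))
```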

demand (for health services) Willingness and/or ability to seek, use, and, in some
settings, to pay for services. Sometimes further subdivided into expressed demand
(equated with use) and potential demand, or need.
demographic transition The transition from high to low fertility and mortality rates,
usually related to technological change and industrialization.
demography The study of populations, especially with reference to size and density,
fertility, mortality, growth, age distribution, migration, and vital statistics, and
the interaction of all these with social and economic conditions.
demonstration model An experimental health care facility, program, or system with 
built-in provision for measuring aspects such as costs per unit of service, rates of 
use by patients or clients, and outcomes of encounters between providers and users. 
The aim usually is to determine the feasibility, efficacy, effectiveness, and/or effi¬ 
ciency of the model service. 

denominator The lower portion of a fraction used to calculate a rate or ratio. The 
population (or population experience, as in person-years, passenger-miles, etc.) at 
risk in the calculation of a rate or ratio. See also numerator.
density of population Demographic term meaning numbers of persons in relation to 
available space. 

density sampling A method of selecting controls in a case control study in which 
cases are sampled only from incident cases over a specific time period, and controls 
are sampled and interviewed throughout that period (rather than simply at one 
point in time, such as the end of the period). This method can reduce bias due to 
changing exposure patterns in the source population. 
dependency ratio Proportion of children and old people in a population in compari¬ 
son to all others, i.e., the proportion of economically inactive to economically active; 
“children” are usually defined as ages under 15 and “old people” as ages 65 and
over. 
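
A minimal sketch of the calculation, using the age limits given in the entry (under 15, and 65 and over). The function name and the population counts are hypothetical.

```python
def dependency_ratio(under_15, aged_15_to_64, aged_65_and_over):
    """Children plus old people, relative to everyone else in the population."""
    if aged_15_to_64 <= 0:
        raise ValueError("aged_15_to_64 must be positive")
    return (under_15 + aged_65_and_over) / aged_15_to_64


# Invented counts, for illustration only.
ratio = dependency_ratio(under_15=300, aged_15_to_64=600, aged_65_and_over=100)
print(f"{ratio:.2f} dependents per person aged 15 to 64")
```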

dependent variable 

1. A variable the value of which is dependent on the effect of other variable(s) 
[independent variable(s)] in the relationship under study. A manifestation or 
outcome whose variation we seek to explain or account for by the influence of 
independent variables. 

2. In statistics, the dependent variable is the one predicted by a regression equa¬ 
tion. 

See also independent variable. 

descriptive study A study concerned with and designed only to describe the existing 
distribution of variables, without regard to causal or other hypotheses. Contrast 
analytic study. An example is a community health survey, used to determine the 
health status of the people in a community. Descriptive studies, e.g., analyses of
cancer registry data, can be used to measure risks. 

DESIGN See RESEARCH DESIGN.

DESIGN VARIABLE 

1. A study variable whose distribution in the subjects is determined by the
investigator.
2. In statistics, a variable taking on the value 1 to indicate membership in a
particular category and 0 or -1 to indicate nonmembership in the category. Used
primarily in analysis of variance.

determinant Any factor, whether event, characteristic, or other definable entity, that 
brings about change in a health condition, or other defined characteristic. See also
CAUSALITY, FACTORS IN.

diagnosis The process of determining health status and the factors responsible for 
producing it; may be applied to an individual, family, group, or community. The 
term is applied both to the process of determination and to its findings. See also
DISEASE LABEL.

diagnostic index A system for recording diagnoses, diseases, or problems of patients 
or clients in a medical practice or service, usually including identifying information 
(name, birthdate, sex) and dates of encounters. See also e-book.

differential The difference(s) shown in tabulation of health and vital statistics
according to age, sex, or some other factor; age differentials are the differences
revealed in the tabulations of rates in age-groups, sex differentials are the differences
in rates between males and females, income differentials are differences between 
designated income categories, etc. 

digit preference A preference for certain numbers that leads to rounding off
measurements. Rounding off may be to the nearest whole number, even number,
multiple of 5 or 10, or (when time units like a week are involved) 7, 14, etc. This can
be a form of observer variation, or an attribute of respondent(s) in a survey.

dimensionality The number of dimensions, i.e., scalar quantities, needed for accurate 
description of an element of a vector space.

direct adjustment, direct standardization See standardization. 

disability Temporary or long-term reduction of a person's capacity to function in so¬ 
ciety. See also international classification of impairments, disabilities, and 
handicaps for the official WHO definition.

discordant A term used in twin studies to describe a twin pair in which one twin 
exhibits a certain trait and the other does not. Also used in matched pair case
control studies to describe a pair whose members had different exposures to the 
risk factor under study. Only the discordant pairs are informative about the asso¬ 
ciation between exposure and disease. 
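
To illustrate why only discordant pairs carry information, the sketch below uses the standard matched-pair estimate of the odds ratio, the ratio of the two kinds of discordant pairs. This estimator is not stated in the entry itself, and the function name and pair counts are hypothetical.

```python
def matched_pair_odds_ratio(pairs):
    """Odds ratio from matched case-control pairs.

    Each pair is (case_exposed, control_exposed). Concordant pairs are ignored;
    the estimate is (case exposed, control not) / (control exposed, case not).
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; ratio undefined")
    return case_only / control_only


# Invented data: 6 + 3 discordant pairs and 30 concordant pairs.
pairs = (
    [(True, False)] * 6
    + [(False, True)] * 3
    + [(True, True)] * 10
    + [(False, False)] * 20
)
print(matched_pair_odds_ratio(pairs))
```

Dropping the 30 concordant pairs from the list leaves the estimate unchanged.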

discrete data Data that can be arranged into naturally occurring or arbitrarily
selected groups or sets of values, as opposed to data in which there are no naturally
occurring breaks in continuity, i.e., continuous data. An example is number of
decayed, missing, and filled teeth (DMF).

discriminant analysis A statistical analytic technique used with discrete dependent
variables, concerned with separating sets of observed values and allocating new
values; can sometimes be used instead of regression analysis. Kendall and Buckland 1
refer to this as “discriminatory analysis” and describe it as a rule for allocating
individuals or values from two or more discrete populations to the correct
population with minimal probability of misclassification.

1 Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.

disease Literally, dis-ease, the opposite of ease, when something is wrong with a bodily
function. The words “disease,” “illness,” and “sickness” are loosely interchangeable,
but are better regarded as not wholly synonymous. M. W. Susser has suggested that
they be used as follows:

Disease is a physiological/psychological dysfunction;

Illness is a subjective state of the person who feels aware of not being well;



Sickness is a state of social dysfunction, i.e., a role that the individual assumes
when ill.

disease frequency survey See cross-sectional study; morbidity survey.
disease label The identity of the condition from which a patient suffers. It may be
the name of a precisely defined disorder identified by a battery of tests, a
probability statement based on consideration of what is most likely among several
possibilities, or an opinion based on pattern recognition. Use of the word "label" can convey
stigma, so this term should be used with care, if at all. See also diagnosis.

DISEASE ODDS RATIO See ODDS RATIO. 

disease, preclinical Disease with no signs or symptoms, because they have not yet
developed. See also inapparent infection.
disease registry See register, registry.

disease, subclinical A condition in which disease is detectable by special tests but does
not reveal itself by signs or symptoms. 
disease taxonomy See taxonomy of disease. 

Disinfection Killing of infectious agents outside the body by direct exposure to chem¬ 
ical or physical agents. 

Concurrent disinfection is the application of disinfective measures as soon as
possible after the discharge of infectious material from the body of an infected person,
or after the soiling of articles with such infectious discharges, all personal contact
with such discharges or articles being minimized prior to such disinfection.

Terminal disinfection is the application of disinfective measures after the patient
has been removed by death or to a hospital, or has ceased to be a source of
infection, or after other hospital isolation practices have been discontinued. Terminal
disinfection is rarely practiced; terminal cleaning generally suffices, along with
airing and sunning of rooms, furniture, and bedding. Disinfection is necessary only
for diseases spread by indirect contact; steam sterilization or incineration of
bedding and other items is desirable after a disease such as plague or anthrax. 1

1 Benenson AS (ed): Control of Communicable Diseases in Man, 14th ed. Washington, DC: American
Public Health Association, 1985.

disinfestation Any physical or chemical process serving to destroy or remove
undesired small animal forms, particularly arthropods or rodents, present upon the
person, the clothing, or in the environment of an individual, or on domestic animals.
Disinfestation includes delousing for infestation with Pediculus humanus humanus,
the body louse. Synonyms include the terms "disinsection" and "disinsectization"
when insects only are involved.

distribution The complete summary of the frequencies of the values or categories of 
a measurement made on a group of persons. The distribution tells either how many 
or what proportion of the group was found to have each value (or each range of 
values) out of all the possible values that the quantitative measure can have. 
distribution-free method A method which does not depend upon the form of the 
underlying distribution. 

distribution function A function that gives the relative frequency with which a ran¬ 
dom variable falls at or below each of a series of values. Examples include the 
normal distribution, log-normal distribution, chi-square distribution, t distribution,
F distribution, and binomial distribution, all of which have applications in
epidemiology.
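
For observed data, the same idea can be illustrated with an empirical distribution function: the fraction of sample values at or below a given point. This is a minimal sketch with a hypothetical function name and invented data, not one of the named theoretical distributions listed above.

```python
from bisect import bisect_right


def empirical_cdf(sample):
    """Return F, where F(x) is the fraction of sample values at or below x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must not be empty")

    def cdf(x):
        # bisect_right counts how many ordered values are <= x.
        return bisect_right(ordered, x) / n

    return cdf


F = empirical_cdf([1, 2, 2, 3])  # invented measurements
for x in (0, 1, 2, 3):
    print(x, F(x))
```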

DMF The abbreviation DMF stands for decayed, missing, and filled teeth. Lowercase
letters, i.e., dmf, are used for deciduous dentition, upper case for permanent teeth. 
The DMF number is widely used in dental epidemiology. 

dose-response relationship A relationship in which a change in amount, intensity,
or duration of exposure is associated with a change, either an increase or a
decrease, in risk of a specified outcome.

double-blind trial A procedure of blind assignment to study and control groups and
blind assessment of outcome, designed to ensure that ascertainment of outcome is
not biased by knowledge of the group to which an individual was assigned.
"Double" refers to both parties, i.e., the observer(s) in contact with the subjects, and the
subjects in the study and control groups. See also blind experiment; randomized
controlled trial.

drift See genetic drift; social drift. 

droplet nuclei A type of particle implicated in the spread of airborne infection. Droplet 
nuclei are tiny particles (1-10 μm diameter) that represent the dried residue of
droplets. They may be formed by (1) evaporation of droplets coughed or sneezed
into the air or (2) aerosolization of infective materials. See also transmission of
infection. 

dropout A person enrolled in a study who becomes inaccessible or ineligible for
follow-up, e.g., because of inability or unwillingness to remain enrolled in the study.
The occurrence of dropouts can lead to biases in study results. 

DUMMY variable See indicator variable. 

dynamic population A population that gains and loses members; all natural popula¬ 
tions are dynamic, a fact recognized by the term "population dynamics" used by 
demographers to denote changing composition. See also population dynamics; 
stable population. 



early warning system In disease surveillance, a specific procedure to detect as early
as possible any departure from usual or normally observed frequency of
phenomena. For example, the routine monitoring of numbers of deaths from pneumonia
and influenza in large American cities is an early warning system for the identifi¬ 
cation of influenza epidemics. In developing countries, a change in children's av¬ 
erage weights is an early warning signal of nutritional deficiency. 

E-book Method (developed by Eimerl) 1 of recording encounters in primary medical
care: encounters are arranged by problem or diagnostic category, thus making it 
easy to count the number of persons seen (and the number of times each is seen) 
according to problem or diagnostic category in a given period of time. Widely used 
in epidemiologic studies of primary medical care. See also age-sex register;
diagnostic index.

1 Eimerl TS: Organized curiosity. J Coll Gen Practit 3:246-252, 1960.

ecological analysis Analysis based on aggregated or grouped data; errors in infer¬ 
ence may result because associations may be artifactually created or masked by the 
aggregation process. 

ecological correlation A correlation in which the units studied are populations rather 
than individuals. Correlations found in this manner may not hold true for the
individual members of these populations. See also ecological fallacy.

ecological fallacy (Syn: aggregation bias, ecological bias) 

1. The bias that may occur because an association observed between variables on 
an aggregate level does not necessarily represent the association that exists at 
an individual level. 

2. An error in inference due to failure to distinguish between different levels of 
organization. A correlation between variables based on group (ecological) 
characteristics is not necessarily reproduced between variables based on indi¬ 
vidual characteristics; an association at one level may disappear at another, or 
even be reversed. Example: At the ecological level, a correlation has been found 
in several studies between the quality of drinking water and mortality rates 
from heart disease; it would be an ecological fallacy to infer from this alone 
that exposure to water of a particular level of hardness necessarily influences 
the individual's chances of getting or dying of heart disease. 
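
The reversal described in the second sense can be shown with a small constructed data set. In the sketch below, all numbers are invented so that the correlation across group means is +1 while the correlation among individuals within every group is -1; the group names and the helper function are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Invented individual-level data: (exposure, outcome) for three groups.
groups = {
    "A": ([1, 2, 3], [3, 2, 1]),
    "B": ([4, 5, 6], [6, 5, 4]),
    "C": ([7, 8, 9], [9, 8, 7]),
}

# Ecological (group-level) correlation uses only the group means.
mean_exposure = [sum(xs) / len(xs) for xs, _ in groups.values()]
mean_outcome = [sum(ys) / len(ys) for _, ys in groups.values()]
print("between groups:", pearson(mean_exposure, mean_outcome))

# Individual-level correlation inside each group points the other way.
for name, (xs, ys) in groups.items():
    print("within group", name, pearson(xs, ys))
```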

ecological study A study in which the units of analysis are populations or groups of 
people, rather than individuals. An example is the study of association between 
median income and cancer mortality rates in administrative jurisdictions such as 
states and counties. 

ecology The study of the relationships among living organisms and their
environment. “Human ecology” means the study of human groups as influenced by
environmental factors, often including social and behavioral factors.
ecosystem The plant and animal life of a region considered in relation to the
environmental factors that influence it; more specifically, the fundamental unit in ecology,
comprising the living organisms and the nonliving elements that interact in a de¬ 
fined region. 

effect The result of a cause. In epidemiology, frequently a synonym for effect mea¬ 
sure. 

effectiveness The extent to which a specific intervention, procedure, regimen, or
service, when deployed in the field, does what it is intended to do for a defined
population.

EFFECT MEASURE A quantity that measures the effect of a factor on the frequency or
risk of a health outcome. Three such measures are attributable fractions, which 
measure the fraction of cases due to a factor; risk and rate differences, which mea¬ 
sure the amount a factor adds to the risk or rate of a disease; and risk and rate 
ratios, which measure the amount by which a factor multiplies the risk or rate of 
disease. 
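
The three kinds of measure can be computed side by side from the risks in exposed and unexposed groups. The sketch below is illustrative only: the function name and counts are hypothetical, and the attributable fraction shown is the usual "among the exposed" form, (R1 - R0) / R1, which this entry does not spell out.

```python
def effect_measures(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Risk difference, risk ratio, and attributable fraction among the exposed."""
    risk_exposed = cases_exposed / n_exposed
    risk_unexposed = cases_unexposed / n_unexposed
    if risk_exposed == 0 or risk_unexposed == 0:
        raise ValueError("both risks must be non-zero for the ratio measures")
    return {
        "risk_difference": risk_exposed - risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "attributable_fraction_exposed": (risk_exposed - risk_unexposed) / risk_exposed,
    }


# Invented cohort: 30 cases among 100 exposed, 10 cases among 100 unexposed.
measures = effect_measures(30, 100, 10, 100)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```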

EFFECT modifier (Syn: conditional variable, moderator variable) A factor that modifies 
the effect of a putative causal factor under study. For example, age is an effect 
modifier for many conditions, and immunization status is an effect modifier for the 
consequences of exposure to pathogenic organisms. Effect modification is detected 
by varying the selected effect measure for the factor under study across levels of
another factor. See also causality, factors in; interaction. 

effective sample size Sample size after dropouts, deaths, and other specified exclu¬ 
sions from an original sample. 

efficacy The extent to which a specific intervention, procedure, regimen, or service 
produces a beneficial result under ideal conditions. Ideally, the determination of 
efficacy is based on the results of a randomized controlled trial. 

EFFICIENCY 

1. The effects or end-results achieved in relation to the effort expended in terms 
of money, resources, and time. The extent to which the resources used to 
provide a specific intervention, procedure, regimen, or service of known effi¬ 
cacy and effectiveness are minimized. A measure of the economy (or cost in 
resources) with which a procedure of known efficacy and effectiveness is car¬ 
ried out. 

2. In statistics, the relative precision with which a particular study design or es¬ 
timator will estimate a parameter of interest. 

egg count See worm count.

ELIMINATION See ERADICATION (OF DISEASE). 

empirical Based directly on experience, e.g., observation or experiment, rather than 
on reasoning alone. 

encounter A face-to-face transaction between a personal health worker and a patient 
or client. 

endemic disease The constant presence of a disease or infectious agent within a given 
geographic area or population group; may also refer to the usual prevalence of a 
given disease within such area or group. See also holoendemic disease; hyperen¬ 
demic DISEASE. 

END results See OUTCOMES. 

environment All that which is external to the individual human host. Can be divided 
into physical, biological, social, cultural, etc., any or all of which can influence health
status of populations. 

epidemic [from the Greek epi (upon), demos (people)] The occurrence in a community
or region of cases of an illness, specific health-related behavior, or other health- 
related events clearly in excess of normal expectancy. The community or region, 
and the period in which the cases occur, are specified precisely. The number of 
cases indicating the presence of an epidemic varies according to the agent, size, and 
type of population exposed, previous experience or lack of exposure to the disease, 
and time and place of occurrence; epidemicity is thus relative to usual frequency of 
the disease in the same area, among the specified population, at the same season of
the year. A single case of a communicable disease long absent from a population or
first invasion by a disease not previously recognized in that area requires immediate 
reporting and full field investigation; two cases of such a disease associated in time 
and place may be sufficient evidence to be considered an epidemic. 

The word may be used also to describe outbreaks of disease in animal or plant 
populations. See also epizootic; epornithic. 

epidemic, common source (Syn: common vehicle epidemic, holomiantic disease)
Outbreak due to exposure of a group of persons to a noxious influence that is common
to the individuals in the group. When the exposure is brief and essentially
simultaneous, the resultant cases all develop within one incubation period of the disease
(a “point” or “point source” epidemic).

The term “holomiantic disease” was used by Stallybrass (1931) to describe
outbreaks of this type, but as with several other terms created from Greek or Latin
roots, transmission to epidemiologists who lacked a classical education did not take
place.

epidemic curve A graphic plotting of the distribution of cases by time of onset. 
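
A minimal sketch of how the data behind such a curve might be tabulated: cases are counted by day of onset, and days without cases are kept so that gaps remain visible. The function name and the onset dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onset_dates):
    """Return [(day, number_of_cases), ...] for every day from first to last onset."""
    if not onset_dates:
        return []
    counts = Counter(onset_dates)
    day, last_day = min(onset_dates), max(onset_dates)
    curve = []
    while day <= last_day:
        curve.append((day, counts.get(day, 0)))
        day += timedelta(days=1)
    return curve


# Invented onset dates, for illustration only.
onsets = [date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 3)]
for day, cases in epidemic_curve(onsets):
    print(day.isoformat(), "#" * cases)
```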

EPIDEMIC, MATHEMATICAL MODEL OF See MATHEMATICAL MODEL.

EPIDEMIC, POINT SOURCE See EPIDEMIC, COMMON SOURCE. 

epidemiologist An investigator who studies the occurrence of disease or other health- 
related conditions or events in defined populations. The control of disease in pop¬ 
ulations is often also considered to be a task for the epidemiologist, especially in 
speaking of certain specialized fields such as malaria epidemiology. Epidemiologists 
may study disease in populations of animals and plants, as well as among human 
populations. See also clinical epidemiologist. 

epidemiology The study of the distribution and determinants of health-related states 
or events in specified populations, and the application of this study to control of 
health problems. 

There have been many definitions of epidemiology. In the past 50 years or so, 
the definition has broadened from concern with communicable disease epidemics 
to take in all phenomena related to health in populations. 

The Oxford English Dictionary (OED) gives as a definition: “That branch of medical
science which treats of epidemics” and cites Parkin (1873) as a source. However,
there was a “London Epidemiological Society” in the 1850s. The identity of the
scholar who first used the word at that time has been lost. Epidemiology appears in
the title of a Spanish history of epidemics, Epidemiología española, Madrid, 1802.

Epidemic is much older. The word appears in Johnson's Dictionary (1775), and
OED gives a citation dated 1603. The word was, of course, used by Hippocrates. 

EPIDEMIOLOGY, ANALYTIC See ANALYTIC STUDY. 

epidemiology, descriptive Study of the occurrence of disease or other health-related
characteristics in human populations. General observations concerning the
relationship of disease to basic characteristics such as age, sex, race, occupation, and social
class; also concerned with geographic location. The major characteristics in
descriptive epidemiology can be classified under the headings: persons, place, and time.
See also observational study. 

EPIDEMIOLOGY, EXPERIMENTAL See EXPERIMENTAL EPIDEMIOLOGY. 
episode Period in which a health problem or illness exists, from its onset to its resolu¬ 
tion. See also encounter. 

epizootic An outbreak (epidemic) of disease in an animal population (often with the
implication that it may also affect human populations). 
epornithic An outbreak (epidemic) of disease in a bird population. 
eradication (of disease) Termination of all transmission of infection by
extermination of the infectious agent through surveillance and containment. Eradication, as
in the instance of smallpox, was based on the joint activities of control and
surveillance. Regional eradication has been successful with malaria and in some countries
appears close to succeeding for measles. The term “elimination” is sometimes used
to describe eradication of diseases such as measles from a large geographic region 
or political jurisdiction. 

ERROR 

1. A false or mistaken result obtained in a study or experiment. Several kinds of 
error can occur in epidemiology, for example, due to bias. 

2. Random error is the portion of variation in a measurement that has no ap¬ 
parent connection to any other measurement or variable, generally regarded 
as due to chance. 

3. Systematic error, which often has a recognizable source, e.g., a faulty measur¬ 
ing instrument, or pattern, e.g., it is consistently wrong in a particular direc¬ 
tion. See also bias. 

error, type I (Syn: alpha error) The error of rejecting a true null hypothesis. See also
SIGNIFICANCE LEVEL; STATISTICAL TEST. 

ERROR, TYPE II (Syn: beta error) The error of failing to reject a false null hypothesis. 
See also power; statistical test. 

estimate A measurement or a statement about the value of some quantity is said to be 
an estimate if it is known, believed, or suspected to incorporate some degree of 
error. 

estimator In statistics, a function for computing estimates of a parameter from ob¬ 
served data. 

ethics The branch of philosophy that deals with the distinction between right and 
wrong, with the moral consequences of human actions. Ethical principles govern 
the conduct of epidemiology, as they do all human activities; the ethical issues that 
are specific to epidemiological practice and research include informed consent, con¬ 
fidentiality, and respect for human rights. The issues have been defined, described, 
and discussed by many writers and by special committees under the auspices of 
research granting agencies and other official bodies in many countries. 1 
1 See, for example, the following: Curran WJ: Protecting confidentiality in epidemiologic
investigations by the Centers for Disease Control. N Engl J Med 314:1027-1028, 1986.

Susser MW, Stein Z, Kline J: Ethics in epidemiology. Ann Amer Acad Pol Soc Sci 457:128-141, 1978.

Commonwealth of Australia, National Health and Medical Research Council, Medical Research
Ethics Committee: Report on Ethics in Epidemiological Research. Canberra, 1985.

Stolley PD: Faith, evidence and the epidemiologist. J Public Health Pol 6:37-42, 1985.

Gordis L, Gold E, Seltser R: Privacy and protection in epidemiologic and medical research:
Challenge and responsibility. Am J Epidemiol 105:165-168, 1977.

National Academy of Sciences, Institute of Medicine: Ethics of Health Care. Washington, DC, 1974.

Tancredi LR (ed): Ethical issues in epidemiologic research (Vol. VII, series in Psychosocial
Epidemiology). New Brunswick, NJ: Rutgers University Press, 1986.


ethnic group A social group characterized by a distinctive social and cultural tradition,
maintained within the group from generation to generation, a common history and 
origin, and a sense of identification with the group. Members of the group have 
distinctive features in their way of life, shared experiences, and often a common 
genetic heritage. These features may be reflected in their health and disease expe¬ 
rience. See also race. 

etiology Literally, the science of causes, causality; in common usage, cause. See also 
causality; pathogenesis. 

ETIOLOGIC FRACTION (EXPOSED) See ATTRIBUTABLE FRACTION (EXPOSED).

ETIOLOGIC FRACTION (POPULATION) See ATTRIBUTABLE FRACTION (POPULATION). 

evaluation A process that attempts to determine as systematically and objectively as 
possible the relevance, effectiveness, and impact of activities in the light of their
objectives. Several varieties of evaluation can be distinguished, e.g., evaluation of 
structure, process, and outcome. See also clinical trial; effectiveness; efficacy; 
efficiency; health services research; program evaluation and review tech¬ 
niques; quality of care. 

Evans's postulates Expanding biomedical knowledge has led to revision of the
Henle-Koch postulates. Alfred Evans 1 developed those that follow, based on the
Henle-Koch model.

1. Prevalence of the disease should be significantly higher in those exposed to 
the hypothesized cause than in controls not so exposed. 

2. Exposure to the hypothesized cause should be more frequent among those 
with the disease than in controls without the disease—when all other risk 
factors are held constant. 

3. Incidence of the disease should be significantly higher in those exposed to 
the hypothesized cause than in those not so exposed, as shown by prospective 
studies. 

4. The disease should follow exposure to the hypothesized causative agent with 
a distribution of incubation periods on a bell-shaped curve. 

5. A spectrum of host responses should follow exposure to the hypothesized 
agent along a logical biological gradient from mild to severe. 

6. A measurable host response following exposure to the hypothesized cause 
should have a high probability of appearing in those lacking this before
exposure (e.g., antibody, cancer cells), or should increase in magnitude if
present before exposure. This response pattern should occur infrequently in
persons not so exposed.

7. Experimental reproduction of the disease should occur more frequently in 
animals or man appropriately exposed to the hypothesized cause than in those 
not so exposed; this exposure may be deliberate in volunteers, experimen¬ 
tally induced in the laboratory, or may represent a regulation of natural ex¬ 
posure. 

8. Elimination or modification of the hypothesized cause should decrease the 
incidence of the disease (i.e., attenuation of a virus, removal of tar from 
cigarettes). 

9. Prevention or modification of the host’s response on exposure to the hypoth¬ 
esized cause should decrease or eliminate the disease (i.e., immunization, drugs 
to lower cholesterol, specific lymphocyte transfer factor in cancer).

10. All of the relationships and findings should make biological and epidemio¬ 
logic sense. 

1 Evans AS: Causation and disease: The Henle-Koch postulates revisited. Yale J Biol Med 49:175-195, 1976.
exact method A statistical method based on the actual, i.e., "exact" probability 
distribution of the study data, rather than on an approximation such as the normal or 
chi-square distribution; for example, Fisher's exact test. 

exact test A statistical test based on the actual null probability distribution of the 
study data, rather than, say, normal approximation. The most common exact test 
is the Fisher-Irwin test for fourfold tables. 

EXCESS RATE AMONG EXPOSED See RATE DIFFERENCE. 

excess risk A term sometimes used to refer to the population excess rate and some¬ 
times to RISK DIFFERENCE. 

expanded programme on immunization Part of the effort to achieve "Health for All 
by the Year 2000," under the auspices of WHO, UNICEF, and other international 
and bilateral aid agencies. This is a program of immunizing against diphtheria, 
tetanus, measles, pertussis, poliomyelitis, and tuberculosis, conducted especially in 
developing countries. 

expectation of life (Syn: life expectancy or expectation) The average number of 
years an individual of a given age is expected to live if current mortality rates 
continue to apply. A statistical abstraction based on existing, age-specific death rates. 

Life expectancy at birth (e0): Average number of years a newborn baby can be 
expected to live if current mortality trends continue. Corresponds to the total number 
of years a given birth cohort can be expected to live, divided by the number of 
children in the cohort. Life expectancy at birth is partly dependent on mortality in 
the first year of life and is lower in poor than in rich countries because of the higher 
infant and child mortality rates in the former. 

Life expectancy at a given age, age x (ex): The average number of additional years a 
person age x would live if current mortality trends continue to apply, based on the 
age-specific death rates for a given year. 

Life expectancy is a hypothetical measure and indicator of current health and 
mortality conditions. It is not a rate. 
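
The calculation behind life expectancy can be sketched as follows. This is a minimal illustration, not a full life-table method: it assumes single-year probabilities of dying (the hypothetical list `qx`), assumes that deaths fall on average at mid-year, and requires the last probability to be 1.0 so that the hypothetical cohort is closed out.

```python
def life_expectancy(qx, start_age=0):
    """Expected remaining years of life at start_age.

    qx[i] is the (assumed) probability of dying between exact ages i and i + 1.
    The last entry must be 1.0 so that the hypothetical cohort dies out.
    """
    if qx[-1] != 1.0:
        raise ValueError("last probability of dying must be 1.0")
    alive = 1.0          # proportion of the cohort alive at start_age
    person_years = 0.0   # years lived by the cohort from start_age onward
    for q in qx[start_age:]:
        deaths = alive * q
        # Survivors live the full year; those who die live half a year on average.
        person_years += alive - 0.5 * deaths
        alive -= deaths
    return person_years


# Illustrative two-year table: half die in the first year, the rest in the second.
print(life_expectancy([0.5, 1.0]))
```

Because the result depends only on the current age-specific probabilities, it describes a hypothetical cohort rather than the future experience of any real birth cohort.
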

experiment A study in which the investigator intentionally alters one or more factors 
under controlled conditions in order to study the effects of so doing. 

experimental epidemiology In modern usage, this term is often equated with 
randomized controlled trials. To Greenwood and other epidemiologists in the 1920s, 
it meant the study of epidemics among colonies of experimental animals such as 
rats and mice. The original meaning of the term is preferable; if the word 
"experiment" is qualified by the adjective "epidemiologic" it is a synonym for randomized 
controlled trial. See also animal model. 

experimental study A study in which conditions are under the direct control of the 
investigator. In epidemiology, a study in which a population is selected for a planned 
trial of a regimen whose effects are measured by comparing the outcome of the 
regimen in the experimental group with the outcome of another regimen in a 
control group. To avoid bias, members of the experimental and control groups should 
be comparable except in the regimen that is offered them. Allocation of individuals 
to experimental or control groups is ideally by randomization. In a randomized 
controlled trial, individuals are randomly allocated; in some experiments, e.g., 
fluoridation of drinking water, whole communities have been (nonrandomly) 
allocated to experimental and control groups. 

explanatory study A study whose main objective is to explain, rather than merely 
describe, a situation, by isolating the effects of specific variables and understanding 
the mechanisms of action. See also pragmatic study. 

EXPLANATORY VARIABLE 

1. A variable that causally explains the association or outcome under study. 

2. In statistics, a synonym for independent variable. 

exposed In epidemiology, the exposed group (or simply, the exposed) is often used to 
connote a group whose members have been exposed to a supposed cause of a 
disease or health state of interest, or possess a characteristic that is a determinant of 
the health outcome of interest. 

EXPOSURE 

1. Proximity and/or contact with a source of a disease agent in such a manner 
that effective transmission of the agent or harmful effects of the agent may 
occur. 

2. The amount of a factor to which a group or individual was exposed; 
sometimes contrasted with dose, the amount that enters or interacts with the 
organism. 

3. Exposures may of course be beneficial rather than harmful, e.g., exposure to 
immunizing agents. 

EXPOSURE-ODDS RATIO See ODDS RATIO. 

exposure ratio The ratio of rates at which persons in the case and control groups of 
a case-control study are exposed to the risk factor (or to the protective factor) 
of interest. 

expressivity In genetics, the extent to which a gene is expressed. 
extrapolate, extrapolation To predict the value of a variate outside the range of 
observations; the resulting prediction. See also interpolate. 
extrinsic incubation period Time required for development of a disease agent in a 
vector from the time of uptake of the agent to the time when the vector is infective. 
See also incubation period; vector-borne infection. 
F distribution (Syn: Variance ratio distribution) The distribution of the ratio of two 
independent quantities each of which is distributed like a variance in normally 
distributed samples. So-named in honor of R.A. Fisher who first described this 
distribution. 

F1 ("F one") Term used in genetics to describe first-generation progeny of a mating. 
factor (Syn: determinant) 

1. An event, characteristic, or other definable entity that brings about a change 
in a health condition or other defined outcome. See also causality, causa¬ 
tion of disease, factors in. 

2. A synonym for (categorical) independent variable, or more precisely, an 
independent variable used to identify, with numerical codes, membership of 
qualitatively different groups. A causal role may be implied, as in "overcrowding 
is a factor in disease transmission" where overcrowding represents the 
highest level of the factor "crowding." 

factor analysis A set of statistical methods for analyzing the correlations among sev¬ 
eral variables in order to estimate the number of fundamental dimensions that un¬ 
derlie the observed data and to describe and measure those dimensions. Used fre¬ 
quently in the development of scoring systems for rating scales and questionnaires. 
factorial design A method of setting up an experiment or study to assure that all 
levels of each intervention or classificatory factor occur with all levels of the others. 
false negative Negative test result in a subject who possesses the attribute for which 
the test is conducted. The labeling of a diseased person as healthy when screening 
in the detection of disease. See also screening; sensitivity and specificity. 
false positive Positive test result in a subject who does not possess the attribute for 
which the test is conducted. The labeling of a healthy person as diseased when 
screening in the detection of disease. See also screening; sensitivity and specificity. 

familial disease Disease that exhibits a tendency to familial occurrence. Familial 
occurrence of disease may be due to genetic transmission, intrafamilial transmission 
of infection or culture, interaction within the family, or the family's shared 
experience, including its exposure to a common environment. 
family A group of two or more persons united by blood, adoptive or marital ties, or 
the common law equivalent; the family may include members who do not share the 
household but are united to other members by blood, adoptive or marital, or 
equivalent ties. Epidemiologic studies may be concerned with family members or with 
those who share the same household or dwelling unit. 
family, extended A group of persons comprising members of several generations united 
by blood, adoptive and marital, or equivalent ties. See also family, nuclear. 
family contact disease Disease that occurs among members of the family of a worker 
who is exposed to a toxic substance and carries this home on his person or his 
clothing, causing exposure to other family members. 

family, nuclear A group of persons comprising members of a single or at most two 
generations, usually husband-wife-children, united by blood or adoptive and mar¬ 
ital or equivalent ties. 

family of classifications In nosology, a set of related classification systems 
describing different aspects of health problems. For example, the International 
Classification of Diseases, the International Classification of Health Problems in Primary Care, 
the International Classification of Impairments, Disabilities and Handicaps, and the 
specialty subclassifications for oncology, psychiatry, etc. developed by WHO 
working groups constitute a "family of classifications." 

family study An epidemiologic study of a family or a group of families. The term has 
been used to describe surveillance of family groups, e.g., for tuberculosis. In ge¬ 
netics, investigation of families showing an unusual characteristic in order to deter¬ 
mine whether the characteristic clusters in certain families and if so, why. 

Farr, William (1807-1883) A medical graduate who became the first compiler of 
abstracts (statistician) to the Registrar-General in the newly established General 
Register Office of England in 1839 and remained there for more than 40 years. In his 
Annual Reports, the combination of facts on death rates and vivid language drew 
attention to many inequalities of health and sickness experience between "healthy" 
and "unhealthy" districts in England. His many contributions to vital statistics and 
epidemiology are contained in his monograph Vital Statistics (London, 1885). These 
include a statement of the relationship between incidence and prevalence, the 
concepts of person-years, retrospective and prospective approaches, observed and 
expected numbers of events, the first workable nosology, and empirical laws about 
the natural history of epidemics. 

fatality rate The death rate observed in a designated series of persons affected by a 
simultaneous event, e.g., victims of a disaster. A term to be deprecated, because it 
can be confused with case fatality rate. 

feasibility study Preliminary study to determine practicability of a proposed health 
program or procedure, or of a larger study, and to appraise the factors that may 
influence its practicability. See also pilot study. 

fecundity The ability to produce live offspring. Fecundity is difficult to measure since 
it refers to the theoretical ability of a woman to conceive and carry a fetus to term. 
If a woman produces a live birth, it is known that she and her consort were fecund 
during some time in the past. 

fertility The actual production of live offspring. Stillbirths, fetal deaths, and abor¬ 
tions are not included in the measurement of fertility in a population. See also 
gravidity; parity. 

fertility rate See general fertility rate. 

fertility ratio A measure of the fertility of the population that restricts the 
denominator to the female population of appropriate age for childbearing. The fertility 
ratio is defined as 

$$\text{Fertility ratio} = \frac{\text{Number of girls under 15 years of age}}{\text{Number of women in 15-49 age group}} \times 1000$$

(Not to be confused with general fertility rate.) 
fetal death (Syn: stillbirth) Death prior to the complete expulsion or extraction from 
its mother of a product of conception, irrespective of the duration of pregnancy. 
The death is indicated by the fact that after such separation the fetus does not 
breathe or show any other evidence of life, such as beating of the heart, pulsation 
of the umbilical cord, or definite movement of voluntary muscles. Defined variously 
as death after the 20th or 28th week of gestation (the definition of the length of 
gestation varies between different jurisdictions, making this event difficult to 
compare internationally). See also live birth. 

fetal death certificate (Syn: certificate of stillbirth) A vital record registering a fetal 
death or stillbirth. Some health jurisdictions require the use of a fetal death certif¬ 
icate for all products of conception, whereas others require its use only in cases in 
which gestation has reached a particular duration, usually the 20th or the 28th 
week. 

fetal death rate (Syn: stillbirth rate) The number of fetal deaths in a year expressed 
as a proportion of the total number of births (live births plus fetal deaths) in the 
same year. 

$$\text{Fetal death rate} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of fetal deaths plus live births in the same year}} \times 1000$$

Note that the denominator is larger than for the fetal death ratio and that the 
fetal death rate is therefore lower than the fetal death ratio, which is used in some 
jurisdictions. International comparisons of stillbirth or fetal death statistics will be 
flawed if the distinction is not appreciated. 

fetal death ratio A measure of fetal wastage, related to the number of live births. 
Defined as 

$$\text{Fetal death ratio} = \frac{\text{Number of fetal deaths in a year}}{\text{Number of live births in the same year}}$$

(Can be expressed per 1000.) 
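
The difference between the fetal death rate and the fetal death ratio is easiest to see with numbers. The sketch below uses hypothetical function names and made-up counts (10 fetal deaths, 990 live births); only the two denominators come from the definitions above.

```python
def fetal_death_rate(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` total births (live births plus fetal deaths)."""
    return per * fetal_deaths / (fetal_deaths + live_births)


def fetal_death_ratio(fetal_deaths, live_births, per=1000):
    """Fetal deaths per `per` live births."""
    return per * fetal_deaths / live_births


# Illustrative counts only, not real data.
rate = fetal_death_rate(10, 990)
ratio = fetal_death_ratio(10, 990)
print(rate, ratio)  # the rate is the smaller of the two
```

With the same counts the rate is always lower than the ratio, because its denominator also includes the fetal deaths.
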

field survey The planned collection of data in "the field," i.e., usually among 
noninstitutionalized persons in the general population. A method of establishing a 
relationship between two or more variables in a population in numerical terms by 
eliciting and collating information from existing sources (not only records but people 
who can say how they feel or what happened). See also cross-sectional study. 
Finlay, Carlos Albert (1833-1915) Cuban physician, initial investigator (1888-1891) 
of the role of Aedes aegypti (then known as Culex fasciatus) in the transmission of 
yellow fever. His experiments were unsatisfactory, but his theory was fully 
confirmed by the experiments of the team led by Reed in which he took an active part. 
Fisher's exact test The test for association in a two-by-two table that is based upon 
the exact hypergeometric distribution of the frequencies within the table. 
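
As a minimal sketch of how the hypergeometric distribution yields an exact test, the code below computes the probability of a two-by-two table with fixed margins and a two-sided p-value. The function names are hypothetical, and the two-sided rule used here (summing the probabilities of all tables no more probable than the one observed) is one common convention, assumed for illustration.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]] given its margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value: total probability of tables as or less probable than observed."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_observed = table_probability(a, b, c, d)
    lowest = max(0, row1 + col1 - n)   # smallest feasible top-left cell
    highest = min(row1, col1)          # largest feasible top-left cell
    p_value = 0.0
    for x in range(lowest, highest + 1):
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if p <= p_observed * (1 + 1e-9):   # small tolerance for floating-point ties
            p_value += p
    return p_value


# Illustrative table: 3 and 1 in the first row, 1 and 3 in the second.
print(fisher_exact_two_sided(3, 1, 1, 3))
```
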
fishing expedition Exploratory study to find clues and leads for further study. 
Although the term is sometimes used pejoratively, "fishing expeditions" may be done 
for worthwhile causes, e.g., to seek clues to the cause of a major life-threatening 
outbreak. A recent example was the initial investigation of Legionnaires' disease. 
fitness This word has specific meanings in several fields related to epidemiology. 

1. In population genetics, a measure of the relative survival and reproductive 
success of a given individual or phenotype, or population subgroup. 

2. In health promotion, health risk appraisal, physical fitness is a set of attributes 
that people have or achieve, that relate to their ability to perform physical 
activity. Intellectual and emotional fitness can also be described and to some 
extent measured. 

fixed cohort A cohort in which membership is fixed by being present at some 
defining event ("zero time"); an example is the cohort comprising survivors of the atomic 
bomb exploded at Hiroshima. See also closed cohort. 
follow-up Observation over a period of time of an individual, group, or initially 
defined population whose appropriate characteristics have been assessed in order to 
observe changes in health status or health-related variables. See also cohort. 

follow-up study 

1. A study in which individuals or populations, selected on the basis of whether 
they have been exposed to risk, received a specified preventive or therapeutic 
procedure, or possess a certain characteristic, are followed to assess the 
outcome of exposure, the procedure, or effect of the characteristic, e.g., 
occurrence of disease. 

2. Synonym for cohort study. 

fomites (singular, fomes) Articles that convey infection to others because they have 
been contaminated by pathogenic organisms. Examples include handkerchief, 
drinking glass, door handle, clothing, and toys. 
force of morbidity (Syn: hazard rate, instantaneous incidence density, instantaneous 
incidence rate, person-time incidence rate) Theoretical measure of the number of 
new cases that occur per unit of population-time, e.g., person-years at risk. This is 
a measure of the occurrence of disease at a point in time, t, defined mathematically 
as the limit, as Δt approaches zero, of 


FOURFOLD TABLE See CONTINGENCY TABLE. 

Fracastorius, Girolamo (1484-1553) Physician, poet, natural scientist, and a man of 
legends, said to have required surgery at birth to open fused lips and to have 
survived a lightning bolt that killed his mother while he was in her arms as an infant. 
He gave the word "syphilis" to the world in his mock-heroic poem, Syphilis Sive 
Morbus Gallicus (1530), which explicitly described the transmission of disease by acts 
of venery. In De Contagione (1546), he described transmission of infection by direct 
contact, by fomites, and "at a distance" by which he meant droplets. 

Framingham study Probably the best known cohort study of heart disease. Since 1949, 
samples of residents of Framingham, Massachusetts, have been subjects of 
investigations of risk factors in relation to the occurrence of heart disease and later, other 
outcomes. 

Frank, Johann Pieter (1745-1821) Author of System einer vollständigen medicinischen 
Polizey, which established hygiene as a systematic science and contained many 
suggestions based on epidemiologic observations. In modern terminology, Frank was 
"Director-general of public health" to the Hapsburg empire in eighteenth century 
Vienna. His System contained many sensible rules for individual good health, and 
detailed specifications for public health practice. 

frequency See occurrence. 

FREQUENCY DISTRIBUTION See DISTRIBUTION. 

FREQUENCY MATCHING See MATCHING. 

frequency polygon A graphic illustration of a distribution, made by joining a set of 
points, for each of which the abscissa is the midpoint of the class and the ordinate, 
or height, is the frequency. 


$$\frac{\text{Probability that a person well at time } t \text{ will develop the disease in the interval } t \text{ to } t + \Delta t}{\Delta t}$$

The average value of this quantity over the interval t to (t + Δt) can be estimated as 

$$\frac{\text{Incident cases observed from } t \text{ to } (t + \Delta t)}{\text{Number of person-time units of experience observed from } t \text{ to } (t + \Delta t)}$$
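
The estimate above (incident cases divided by person-time observed) can be sketched in a few lines. The record layout, a list of `(years_at_risk, became_case)` pairs, and the function name are assumptions made for this illustration.

```python
def person_time_incidence_rate(follow_up):
    """Incident cases per unit of person-time at risk.

    follow_up is a list of (time_at_risk, became_case) pairs, one per person;
    time_at_risk stops at disease onset, loss to follow-up, or end of study.
    """
    cases = sum(1 for _, became_case in follow_up if became_case)
    person_time = sum(time_at_risk for time_at_risk, _ in follow_up)
    if person_time <= 0:
        raise ValueError("no person-time observed")
    return cases / person_time


# Illustrative records: three people contributing 10 person-years, two cases.
records = [(2.0, True), (5.0, False), (3.0, True)]
print(person_time_incidence_rate(records))  # cases per person-year
```
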

force of mortality (Syn: actuarial death rate) The hazard rate of the occurrence of 
death at a point in time t, i.e., the limit as Δt approaches zero, of the probability 
that an individual alive at time t will die by time t + Δt, divided by Δt. Distinct from 
cumulative death rate. 

forecasting A method of estimating what may happen in the future that relies on 
extrapolation of existing trends (demographic, epidemiologic, etc.). It may be less 
useful than scenario building, which has greater flexibility. For example, 
extrapolation of mortality trends for coronary heart disease in the early 1960s in the 
United States suggested that the mortality rates would continue to rise, perhaps 
indefinitely, whereas in fact the rates began to fall soon after that time. 

fortuitous relationship A relationship that occurs by chance and needs no further 
explanation. 

Frequency polygon. From Rimm et al., 1980. 

function A quality, trait, or fact that is so related to another as to be dependent upon 
and to vary with this other. 

forward survival estimate A procedure for estimating the age distribution at some 
later date by projecting forward an observed age distribution. The procedure uses 
survival ratios, often obtained from model life tables. 

Galton, Francis (1822-1911) A founder of the modern science of human biology and 
the inventor of several statistical methods. Perhaps he is best known as the author 
of Hereditary Genius (1869), an analysis of physical and intellectual characteristics of 
successive generations of several hundred prominent families. Observing that 
offspring of parents of unusual talent, height, etc., tended toward average, he 
formulated the "Law of filial regression" (the origin of the term "regression"). His 
statistical approaches were refined and extended by his pupil, Karl Pearson, the 
founder of modern biometry. 

Gaussian distribution See normal distribution. 

game theory A branch of mathematical logic concerned with the range of possible 
reactions to a particular strategy; each reaction can be assigned a probability and 
each reaction can lead to further action by the "adversary" in the game. Used mainly 
in systems analysis and such applications as war-gaming, game theory has occasional 
applications in disease surveillance and control. It is also one of the underlying 
theories used in clinical decision analysis. 

gene A sequence of DNA that codes for a particular protein product or that regulates 
other genes. Genes are the biological basis of heredity and occupy precisely defined 
locations on chromosomes. 

gene pool The total of all genes possessed by reproductive members of a population. 

general fertility rate A more refined measure of fertility than the crude birth rate. 
The denominator is restricted to the number of women of childbearing age (i.e., 
15-44 or 15-49). Defined as 


$$\text{General fertility rate} = \frac{\text{Number of live births in an area during a year}}{\text{Midyear female population age 15-44 in same area in same year}} \times 1000$$


The upper age limit for this rate is 44 years in most jurisdictions. 
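
A short sketch of the general fertility rate calculation follows. The function name and the counts are hypothetical. The age band of the denominator (15-44 or 15-49) is left to whoever supplies the count.

```python
def general_fertility_rate(live_births, midyear_women_childbearing_age, per=1000):
    """Live births per `per` women of childbearing age at midyear."""
    if midyear_women_childbearing_age <= 0:
        raise ValueError("female population must be positive")
    return per * live_births / midyear_women_childbearing_age


# Illustrative area: 6,500 live births and 100,000 women aged 15-44 at midyear.
print(general_fertility_rate(6500, 100000))
```

Restricting the denominator to women of childbearing age is what makes this measure more refined than the crude birth rate, whose denominator is the whole midyear population.
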
generation effect (Syn: cohort effect) Variation in health status that arises from the 
different causal factors to which each birth cohort (see cohort) in the population 
is exposed as the environment and society change. Each consecutive birth cohort is 
exposed to a unique environment that coincides with its life span. 
generation time The interval between receipt of infection by and maximal infectivity 
of the host. This applies to both clinical cases and inapparent infections. 

With person-to-person transmission of infection, the interval between cases is 
determined by the generation time. See also incubation period. 
genetic drift Random variation in gene frequency from generation to generation; 
most often observed in small populations. The process of evolution through 
random statistical fluctuation of genetic composition of populations. 

genetic epidemiology The science that deals with the etiology, distribution and 
control of disease in groups of relatives, and with inherited causes of disease in 
populations.1 

1 Morton NE: Outline of genetic epidemiology. New York: Karger, 1982. 

genetic linkage Particular genes occupy specific sites in chromosomes, one member 
of each pair of chromosomes of course coming from each parent. When two genes 
are fairly close to each other in the same chromosome pair, they tend to be inher¬ 
ited together. Such genes are said to be linked, and the phenomenon is called ge¬ 
netic linkage. 

genetic penetrance The extent to which a genetically determined condition is ex¬ 
pressed in an individual. This determines the frequency with which genetic effect 
is shown in a population. 

genetics The branch of biology dealing with heredity and variation of individual 
members of a species. Its branches include population genetics, which overlaps ep¬ 
idemiology; therefore we include pertinent genetic terms in this dictionary. 

genome The array of genes carried by an individual. 

geographic pathology (Syn: medical geography) The comparative study of 
countries, or of regions within them, with regard to variations in morbidity/mortality. 
The (implied) aim of such study is usually to demonstrate that the variations are 
caused by or related to differences in the geographic environment. 

geometric mean See mean, geometric. 

gestational age Strictly speaking, the gestational age of a fetus is the elapsed time 
since conception. However, as the moment when conception occurred is rarely known 
precisely, the duration of gestation is measured from the first day of the last normal 
menstrual period. Gestational age is expressed in completed days or completed weeks 
(e.g., events occurring 280-286 days after the onset of the last normal menstrual 
period are considered to have occurred at 40 weeks of gestation). 

Measurements of fetal growth, as they represent continuous variables, are 
expressed in relation to a specific week of gestational age (e.g., the mean birth weight 
for 40 weeks is that obtained at 280-286 days of gestation on a 
weight-for-gestational-age curve). Some specified variations of gestational age are: 
Preterm: Less than 37 completed weeks (less than 259 days). 
Term: From 37 to less than 42 completed weeks (259-293 days). 
Postterm: Forty-two completed weeks or more (294 days or more). 
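
The day and week boundaries above translate directly into code. The sketch below uses hypothetical function names. Days are counted from the first day of the last normal menstrual period, as in the definition.

```python
def completed_weeks(days_since_lmp):
    """Gestational age in completed weeks (280-286 days -> 40 weeks)."""
    return days_since_lmp // 7


def gestational_category(days_since_lmp):
    """Classify gestational age as preterm, term, or postterm."""
    if days_since_lmp < 259:      # less than 37 completed weeks
        return "preterm"
    if days_since_lmp < 294:      # 37 to less than 42 completed weeks
        return "term"
    return "postterm"             # 42 completed weeks or more


for days in (258, 259, 286, 294):
    print(days, completed_weeks(days), gestational_category(days))
```
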

"gold standard" A jargon term, used to describe a method, procedure, or 
ment that is widely accepted as being the best available. Often used to compare with 
new methods. 

Goldberger, Joseph (1874-1927) A U.S. Public Health Service physician. Responsible 
for a brilliant series of investigations of pellagra. After logical deductions led him 
to reject the prevailing view that pellagra had an infectious origin, he conducted 
studies in several rural communities and in institutions, leading conclusively to the 
demonstration that pellagra was a dietary deficiency disease. 

Gompertz's law The proportionate relationship of mortality to age. Mortality is high 
during the first year of life (infancy), drops to its lowest level in childhood, and 
gradually climbs during the third and fourth decade. After age 35 or 40, the 
increase in mortality with age tends to be logarithmic for the remainder of the life 
span, i.e., the relative increase in mortality in each successive age class (of equal 
size) is about constant. This law was first enunciated by the demographer Benjamin 
Gompertz, on the basis of survival curves in English villages in the 1840s. 
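
The constant relative increase described above corresponds to an exponential model of the form m(x) = m(ref) * exp(b * (x - ref)). The sketch below is illustrative only: the function names, the reference rate of 0.002 at age 40, and the slope b = 0.085 per year are assumed values chosen for the example, not figures from this dictionary.

```python
import math


def gompertz_mortality(age, rate_at_ref=0.002, b=0.085, ref_age=40):
    """Mortality rate at `age` under an exponential (Gompertz) increase."""
    return rate_at_ref * math.exp(b * (age - ref_age))


def relative_increase(age, width, **params):
    """Ratio of mortality at the end of an age class to that at its start."""
    return gompertz_mortality(age + width, **params) / gompertz_mortality(age, **params)


# The ratio is the same for every 10-year age class, whatever the starting age.
print(relative_increase(40, 10), relative_increase(60, 10))
print(math.log(2) / 0.085)  # years for mortality to double under the assumed slope
```
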
gonadotrophic cycle One complete round of ovarian development in the mosquito 
(or other insect vector) from the time when the blood meal is taken to the time 
when the fully developed eggs are laid. 

goodness of fit Degree of agreement between an empirically observed distribution 
and a mathematical or theoretical distribution. 
goodness of fit test A statistical test of the hypothesis that data have been randomly 
sampled or generated from a population that follows a particular theoretical 
distribution or model. The most common such tests are chi-square tests. 
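
As a minimal sketch of a chi-square goodness of fit statistic, the code below sums (observed - expected)² / expected over the categories. The function name and the counts are hypothetical. The statistic would then be referred to a chi-square distribution with the appropriate degrees of freedom, a step not shown here.

```python
def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic for observed versus expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Illustrative counts: 60 observations expected to fall equally into 3 classes.
observed = [18, 22, 20]
expected = [20, 20, 20]
print(chi_square_statistic(observed, expected))
```
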
gradient of infection The variety of host responses to infection ranging from 
inapparent infection to fatal illness. 

graph Visual display of the relationship between variables; the values of one set of 
variables are plotted along the horizontal or x axis, of a second variable, along the 
vertical or y axis. Three-dimensional graphs of relationships between three variables 
can be represented and comprehended visually in two dimensions. The relationship 
between x and y may be linear, exponential, logarithmic, etc. See also axis, abscissa, 
ordinate. "Graph" is also a descriptive term for histograms, bar charts, etc. 



Graph showing abscissa, ordinate, and locus of a point, P. 
in relation to x and y axis. 

Graunt, John (1620-1674) By profession a haberdasher, he was a member of the 
small community of scholars and natural scientists in London who were Fellows of 
the Royal Society in its early years and who made important contributions to the 
natural sciences. Graunt studied the bills of mortality and used them to conduct 
the first analytic studies of vital statistics, identifying differences in mortality rates 
between the sexes, between city and country folk, and recording all in Natural and 
political observations mentioned in a following index and made upon the Bills of Mortality 
(London, 1662). 
gravidity The number of pregnancies (completed or incomplete) experienced by a 
woman. 

Greenwood, Major (1888-1949) Medical epidemiologist, trained in statistics by Karl 
Pearson; Greenwood was the first professor of epidemiology at the London School 
of Hygiene and Tropical Medicine. He inspired a whole generation of British epi¬ 
demiologists, introducing to the subject a level of mathematical reasoning and sta¬ 
tistical rigor it had not previously known. Author of many papers and several mon¬ 
ographs, best known of which is Epidemics and Crowd Diseases (London, 1933). 
gross reproduction rate The average number of female children a woman would 
have if she survived to the end of her childbearing years and if, throughout that 
period, she were subject to a given set of age-specific fertility rates and a given sex 
ratio at birth. This rate provides a measure of replacement fertility in the absence 
of mortality. See also net reproduction rate. 
growth rate of population A measure of population growth (in the absence of 
migration) comprising addition of newborns to the population and subtraction of deaths. 
The result, known as natural rate of increase, is calculated as 

$$\frac{\text{Live births during the year} - \text{deaths during the year}}{\text{Midyear population}} \times 1000$$

Alternatively, it is the difference between crude birth rate and crude death rate. 
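
The equivalence between the two ways of computing the natural rate of increase can be checked with a short sketch. The function names and the counts are hypothetical, and rates are expressed per 1000 as an assumption.

```python
def crude_rate(events, midyear_population, per=1000):
    """Events (births or deaths) per `per` midyear population."""
    return per * events / midyear_population


def natural_rate_of_increase(births, deaths, midyear_population, per=1000):
    """(Births - deaths) per `per` midyear population, ignoring migration."""
    return per * (births - deaths) / midyear_population


# Illustrative population of 1,000,000 with 30,000 births and 10,000 deaths.
direct = natural_rate_of_increase(30000, 10000, 1000000)
by_difference = crude_rate(30000, 1000000) - crude_rate(10000, 1000000)
print(direct, by_difference)
```
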
Hackett spleen classification A numerical means of recording the size of an 
enlarged spleen, especially in malaria. This is a 6-point scale of 0 (no enlargement) to 
5 (enlarged to umbilicus or larger). See Terminology of Malaria and of Malaria 
Eradication, Geneva: WHO, 1963, pp. 40-41. 

HALO EFFECT 

1. The effect (usually beneficial) that the manner, attention, and caring of a 
provider have on a patient during a medical encounter regardless of what 
medical procedures or services the encounter involves. See also placebo, 
placebo effect. 

2. The influence upon an observation of the observer's perception of the 
characteristics of the individual observed (other than the characteristic under study) 
or the influence of the observer's recollection or knowledge of findings on a 
previous occasion. 

handicap Reduction in a person's capacity to fulfill a social role as a consequence of an 
impairment, inadequate training for the role, or other circumstances. Applied to 
children, the term usually refers to the presence of an impairment or other circum¬ 
stance that is likely to interfere with normal growth and development or with the 
capacity to learn. See also international classification of impairments, disabil¬ 
ities, and handicaps for the official WHO definition. 

haphazard sample Selection of a group of persons for study without thought as to 
whether they are representative of the population. The word "haphazard" here 
implies selection based on a mixture of criteria such as convenience, accessibility, 
turning up at the time an investigation or study is in progress, and belonging to 
some existing list or registry, etc. Because they have an unknown chance of being 
unrepresentative of the population, haphazard samples are unsatisfactory for 
generalization. 

Hardy-Weinberg law The principle that both gene and genotype frequencies will 
remain in equilibrium in an infinitely large population in the absence of mutation, 
migration, selection, and nonrandom mating. If p is the frequency of one allele and
q is the frequency of another and p + q = 1, then p² is the frequency of homozygotes
for the allele, q² is the frequency of homozygotes for the other allele, and 2pq is the
frequency of heterozygotes.
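
The genotype frequencies described in this entry can be sketched in a few lines of Python. This is an illustration only; the function name and the genotype labels "AA", "Aa", and "aa" are assumptions made for the example and do not come from the dictionary.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for allele frequency p, with q = 1 - p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return {
        "AA": p * p,      # homozygotes for the first allele
        "Aa": 2 * p * q,  # heterozygotes
        "aa": q * q,      # homozygotes for the other allele
    }


# Hypothetical allele frequency chosen for illustration.
freqs = hardy_weinberg(0.7)
for genotype, frequency in freqs.items():
    print(genotype, round(frequency, 4))
```

Because p + q = 1, the three frequencies always sum to 1, which is a quick check on any worked example.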

HARMONIC MEAN See MEAN, HARMONIC. 

Hawthorne effect The effect (usually positive or beneficial) of being under study 
upon the persons being studied; their knowledge of the study often influences their 
behavior. The name derives from work studies by Whitehead, Dickson, Roethlisberger,
and others, in the Western Electric Plant, Hawthorne, Illinois, reported by
Elton Mayo in The Social Problems of an Industrial Civilization (London: Routledge, 
1949). 


hazard A factor or exposure that may adversely affect health. 

hazard rate (Syn: force of morbidity, instantaneous incidence rate) A theoretical 
measure of the risk of occurrence of an event, e.g., death, new disease, at a point
in time, t, defined mathematically as the limit, as Δt approaches zero, of the probability
that an individual well at time t will experience the event by t + Δt, divided by
Δt.
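
The limit in this definition can be approximated numerically. The sketch below assumes, purely for illustration, an exponential survival curve with a constant rate of 0.2 per unit time; the function names and the rate are hypothetical and not part of the dictionary entry.

```python
import math


def survival(t, rate=0.2):
    """Probability of still being well at time t (assumed exponential model)."""
    return math.exp(-rate * t)


def approx_hazard(t, dt=1e-6, rate=0.2):
    """Approximate the hazard rate at time t using a small interval dt."""
    s_t = survival(t, rate)
    s_next = survival(t + dt, rate)
    # Probability that an individual well at time t has the event by t + dt.
    conditional_probability = (s_t - s_next) / s_t
    return conditional_probability / dt


h = approx_hazard(3.0)
print(h)
```

Under this assumed model the approximation should come out very close to the constant rate, whatever value of t is chosen.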

health The World Health Organization (WHO) described health in the preamble to its 
constitution as, "A state of complete physical, mental, and social well-being and not 
merely the absence of disease or infirmity." The WHO description of health has 
been criticized because of the difficulty of defining and measuring "complete" 
wellbeing. 

There are several other definitions, including the following: 

A state of dynamic balance in which an individual's or a group's capacity to cope 
with all the circumstances of living is at an optimum level. 

A state characterized by anatomical, physiological and psychological integrity, ability 
to perform personally valued family, work and community roles; ability to deal with
physical, biological, psychological and social stress; a feeling of well-being; and free¬ 
dom from the risk of disease and untimely death. 

Rene Dubos offered the following definition: "A modus vivendi enabling imperfect
men to achieve a rewarding and not too painful existence while they cope with
an imperfect world."

The word "health" is derived from the Old English Hal meaning hale, whole, 
sound in wind and limb. 

health behavior The combination of knowledge, practices, and attitudes that to¬ 
gether contribute to motivate the actions we take regarding health. Health behavior 
may promote and preserve good health, or if the behavior is harmful, e.g., tobacco
smoking, may be a determinant of disease. This combination of knowledge, practices,
and attitudes has been described and discussed by several writers, notably
Becker.1 See also illness behavior.

1 Becker MH (ed): The Health Belief Model and Personal Health Behavior. Thorofare, NJ: Slack, 1974.

health care Those services provided to individuals or communities by agents of the 
health services or professions, for the purpose of promoting, maintaining, monitoring,
or restoring health. Health care is broader than, and not limited to, medical
care, which implies therapeutic action by or under the supervision of a physician. 
The term is sometimes extended to include self-care. 

health education The process by which individuals and groups of people learn to
behave in a manner conducive to the promotion, maintenance, or restoration of
health. 

health index A numerical indication of the health of a given population derived from 
a specified composite formula. The components of the formula may be infant 
mortality rates, incidence rates for particular disease, or other health indica¬ 
tors. 

health indicator A variable, susceptible to direct measurement, that reflects the state
of health of persons in a community. Examples include infant mortality rates, incidence
rates based on notified cases of disease, disability days, etc. These measures
may be used as components in the calculation of a health index.

HEALTH PROMOTION The process of enabling people to increase control over and improve
their health. It involves the population as a whole in the context of their
everyday lives, rather than focusing on people at risk for specific diseases, and is 
directed toward action on the determinants or causes of health. 

health risk appraisal (hra) (Syn: health hazard appraisal [HHA]) A generic term
applied to methods for describing an individual's chances of becoming ill or dying 
from selected causes. The many versions now available share several common fea¬ 
tures: Starting from the average risk of death for the individual's age and sex, a 
consideration of various lifestyle and physical factors indicates whether the individ¬ 
ual is at greater or less than average risk of death from the commonest causes of 
death for his age and sex. All methods also indicate what reduction in risk could 
be achieved by altering any of the causal factors (such as cigarette smoking) that 
the individual could modify. 

The premise underlying such methods is that information on the extent to which 
an individual's characteristics, habits, and health practices are influencing his future
risk of dying will assist health care workers in counseling their patients.

health services Services that are performed by health care professionals, or by others 
under their direction, for the purpose of promoting, maintaining, or restoring health. 
In addition to personal health care, health services include measures for health 
protection and health education. 

health services research The integration of epidemiologic, sociological, economic, 
and other analytic sciences in the study of health services. Health services research 
is usually concerned with relationships between need, demand, supply, use, and 
outcome of health services. The aim of health services research is evaluation; several
components of evaluative health services research are distinguished, viz.:

Evaluation of structure, concerned with resources, facilities, and manpower. 

Evaluation of process, concerned with matters such as where, by whom, and how
health care is provided.

Evaluation of output, concerned with the amount and nature of health services
provided.

Evaluation of outcome, concerned with the results, i.e., whether persons using health
services experience measurable benefits such as improved survival or reduced
disability.

health statistics Aggregated data describing and enumerating attributes, events, behaviors,
services, resources, outcomes, or costs related to health, disease, and health
services. The data may be derived from survey instruments, medical records, and
administrative documents. Vital statistics are a subset of health statistics.

health status index A set of measurements designed to detect short-term fluctua¬ 
tions in the health of members of a population; these measurements generally in¬ 
clude physical function, emotional well-being, activities of daily living, feelings, etc. 
Most indexes require the use of carefully composed questions designed with refer¬ 
ence to matters of fact rather than shades of opinion. The results are usually ex¬ 
pressed by a numerical score that gives a profile of the well-being of the individual. 

health survey A survey designed to provide information on the health status of a
population. It may be descriptive, exploratory, or explanatory. See also morbidity 
survey. 

healthy worker effect A phenomenon observed initially in studies of occupational 
diseases: Workers usually exhibit lower overall death rates than the general popu¬ 
lation, due to the fact that the severely ill and disabled are ordinarily excluded from 
employment. Death rates in the general population may be inappropriate for com¬ 
parison if this effect is not taken into account. 

hebdomadal mortality rate The mortality rate in the first week of life; the denom¬ 
inator is the number of live births in a year. 

Henle-Koch postulates See Koch's postulates.

herd immunity The immunity of a group or community. The resistance of a group to
invasion and spread of an infectious agent, based on the resistance to infection of
a high proportion of individual members of the group. The resistance is a product 
of the number susceptible and the probability that those who are susceptible will 
come into contact with an infected person. In the herd immunity equation, “prob¬ 
ability of contact" is the intervening factor that reduces susceptibility to infection 
among group members to less than that anticipated from their susceptibility as un¬ 
related individuals. 

HETEROSCEDASTICITY Nonconstancy of the variance of a measure over the levels of the
factors under study. 

hibernation See vector-borne infection. 

Hippocrates of Cos (c 460-370 BC) Greek physician, "Father of Medicine," respon¬ 
sible for careful clinical observation of many important and common diseases— 
tetanus, mumps, puerperal septicemia, etc. His writings contain important epidemiologic
observations, as in the books Airs, Waters, Places, and Epidemics. His Aphorisms
also demonstrate considerable empirical epidemiologic knowledge.

histogram A graphic representation of the frequency distribution of a variable. Rec¬ 
tangles are drawn in such a way that their bases lie on a linear scale representing 
different intervals, and their heights are proportional to the frequencies of the 
values within each of the intervals. See also bar diagram. 



[Figure omitted; horizontal axis: serum cholesterol (mg/100 ml)]

Histogram. From National Center for Health Statistics, 1978.
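
The construction described in the histogram entry (equal intervals on a linear scale, heights proportional to frequencies) can be sketched as follows. The serum cholesterol values, the interval width, and the function name are made up for illustration; they are not the National Center for Health Statistics data.

```python
def histogram_counts(values, start, width, n_bins):
    """Count how many values fall in each of n_bins equal-width intervals."""
    counts = [0] * n_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < n_bins:  # ignore values outside the plotted range
            counts[index] += 1
    return counts


# Hypothetical serum cholesterol readings (mg/100 ml).
cholesterol = [182, 195, 201, 210, 214, 223, 228, 236, 241, 259, 267, 305]
counts = histogram_counts(cholesterol, start=160, width=40, n_bins=4)

# Draw each rectangle as a row of marks whose length equals the frequency.
for i, count in enumerate(counts):
    low = 160 + 40 * i
    print(f"{low}-{low + 39}: " + "#" * count)
```

With equal-width intervals, as here, bar height proportional to frequency is the same as bar area proportional to frequency.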

historical cohort study (Syn: historical prospective study, nonconcurrent prospec¬ 
tive study, prospective study in retrospect) A cohort study conducted by recon¬ 
structing data about persons at a time or times in the past. This method uses exist¬ 
ing records about the health or other relevant aspects of a population as it was at 
some time in the past and determines the current (or subsequent) status of mem¬ 
bers of this population with respect to the condition of interest. Different levels of 
past exposure to risk factor(s) of interest must be identifiable for subsets of the
population. See also cohort study. 

HISTORICAL CONTROL Control subject(s) for whom data were collected at a time preceding
that at which the data are gathered on the group being studied. Because of
differences in exposures, etc., use of historical controls can lead to bias in analysis.

Hogben number A unique personal identifying number constructed by using a sequence
of digits for birthdate, sex, birthplace, and other identifiers. Suggested by
the English mathematician Lancelot Hogben. Used in primary care epidemiology
in some countries and usable in record linkage. See also identification number;
SOUNDEX CODE.

Holmes, Oliver Wendell (1809-1894) Physician, poet, philosopher, autocrat (“of the 
Breakfast Table"), and crusader against puerperal fever. He argued that this was 
conveyed to patients by the contaminated hands and clothes of attending physicians 
and recommended washing the hands and changing clothes as a way to prevent it.
Unlike Semmelweis, he succeeded in convincing the medical profession. His correct
belief was recorded in a paper, "The Contagiousness of Puerperal Fever."1

1 N Engl Q J Med Surg 1:503-530, 1842-43.

holoendemic disease A disease for which a high prevalent level of infection begins
early in life and affects most of the child population, leading to a state of equilibrium
such that the adult population shows evidence of the disease much less commonly
than do the children. Malaria in many communities is a holoendemic disease.

holomiantic infection See common source epidemic.

HOMOSCEDASTICITY Constancy of the variance of a measure over the levels of the factors
under study.

hospital-acquired infection See nosocomial infection. 

hospital discharge abstract system Abstraction of minimum data set from hospital 
charts for the purpose of producing summary statistics about hospitalized patients. 
Examples include the Hospital Inpatient Enquiry (HIPE) and Professional Activity 
Study (PAS). The statistical tabulations commonly include length of stay by final 
diagnosis, surgical operations, specified hospital service (i.e., medical, surgical,
gynecological, etc.) and also give outcomes such as "death" and "discharged alive
from hospital." This system cannot generally be used for epidemiologic purposes 
as it is not possible to infer representativeness or to generalize; this is because the 
data usually lack a defined denominator and the same person may be counted more 
than once in the event of two or more hospital separations in the period of study. 

hospital inpatient enquiry (hipe) Statistical tables of a 10% sample of hospital patients
in England and Wales, showing class of hospital, diagnosis, length of stay,
outcomes, etc.

hospital separation A term used in commentaries on hospital statistics to describe
the departure of a patient from hospital without distinguishing whether the patient
departed alive or dead (the distinction is unimportant so far as the statistics of
hospital activity such as bed occupancy are concerned).

HOST 

1. A person or other living animal, including birds and arthropods, that affords 
subsistence or lodgment to an infectious agent under natural conditions. Some 
protozoa and helminths pass successive stages in alternate hosts of different 
species. Hosts in which the parasite attains maturity or passes its sexual stage
are primary or definitive hosts; those in which the parasite is in a larval or
asexual state are secondary or intermediate hosts. A transport host is a carrier
in which the organism remains alive but does not undergo development.1

2. In an epidemiologic context, the host may be the population or group; biological,
social, and behavioral characteristics of this group that are relevant to
health are called "host factors."

1 Benenson, op cit.

household One or more persons who occupy a dwelling, i.e., a place that provides 
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shelter, cooking, washing, and sleeping facilities; may or may not be a family. The 
term is also used to describe the dwelling unit in which the persons live. 

household sample survey A survey of persons in a sample of households. This, in
many variations, is a favored method of gathering data for health-related and for
many other purposes. The households may be sampled in any of several ways, e.g.,
by cluster, use of random numbers in relation to numbered dwelling units. The
survey may be conducted by interview, telephone survey, or self-completed responses
to preset questions. The method is used in developing nations as well as
in the industrial world.

human blood index Proportion of insect vectors found to contain human blood.

human ecology See ecology.

human immunodeficiency virus (hiv) The pathogenic organism responsible for the 
acquired immunodeficiency syndrome (AIDS); formerly or also known as the 
lymphadenopathy virus (LAV), the name given by the original French discoverers
Montagnier et al.1 in 1983, or the human T-cell lymphotropic virus, type III (HTLV-III),
the name given by Gallo et al.2 to the virus they reported in 1984.

1 Barré-Sinoussi F, Chermann JC, Rey F, et al.: Isolation of a T-lymphotropic retrovirus from a
patient at risk for acquired immune deficiency syndrome (AIDS). Science 220:868-871, 1983.
2 Gallo RC, Salahuddin SZ, Popovic M, et al.: Frequent detection and isolation of cytopathic
retroviruses (HTLV-III) from patients with AIDS and at risk for AIDS. Science 224:500-503, 1984.
hyperendemic disease A disease that is constantly present at a high incidence and/or 
prevalence rate and affects all age groups equally. 

hypergeometric distribution The exact probability distribution of the frequencies in
a two-by-two contingency table, conditional on the marginal frequencies being fixed
at their observed levels.
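
As an illustration of the hypergeometric distribution entry, the probability of one particular two-by-two table, given its fixed margins, can be computed with binomial coefficients. The cell labels a, b, c, d and the function name are assumptions made for this sketch.

```python
from math import comb


def table_probability(a, b, c, d):
    """Probability of the table [[a, b], [c, d]] with all margins held fixed."""
    n = a + b + c + d
    # Ways to fill the first column from each row, divided by the
    # ways to choose the first-column total from all n subjects.
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Hypothetical table: rows are exposed/unexposed, columns are ill/well.
p = table_probability(3, 1, 1, 3)
print(p)

# Probabilities of every table sharing these margins (row totals 4 and 4,
# first-column total 4); they should sum to 1.
total = sum(table_probability(a, 4 - a, 4 - a, a) for a in range(5))
print(total)
```

Summing such probabilities over the observed table and all more extreme tables with the same margins is the basis of Fisher's exact test.
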
hypothesis 

1. A supposition, arrived at from observation or reflection, that leads to refutable 
predictions. 

2. Any conjecture cast in a form that will allow it to be tested and refuted. 

See also null hypothesis. 
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iatrogenic disease Illness resulting from a physician's professional activity, or from 
the professional activity of other health professionals.

ICD See INTERNATIONAL CLASSIFICATION OF DISEASE.

iceberg phenomenon That portion of disease that remains unrecorded or undetected 
despite physicians' diagnostic endeavors and community disease surveillance pro¬ 
cedures is referred to as the "submerged portion of the iceberg." Detected or di¬ 
agnosed disease is the "tip of the iceberg." The submerged portion comprises dis¬ 
ease not medically attended, medically attended but not accurately diagnosed, and 
diagnosed but not reported.1

1 Last JM: The Iceberg. Lancet 2:28-31, 1963.

ICHPPC See INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE. 

identification number, identifying number Unique number given to every individual
at birth or at some other milestone. Sweden has a system based on a sequence
of digits for birthdate, sex, birthplace, and additional digits for each individual.
Other systems, e.g., National Insurance number in the United Kingdom, Social
Security number in the United States, and Social Insurance number in Canada, are
sometimes used but are neither universal nor unique, being sometimes applied to
whole families or at least to more than one individual. See also Hogben number;
SOUNDEX CODE.

idiosyncrasy Webster's Dictionary defines this as a distinctive characteristic or peculi¬ 
arity of an individual. In pharmacoepidemiology, it means an abnormal reaction, 
sometimes genetically determined, following the administration of a medication. 

ILLNESS See DISEASE. 

illness behavior Conduct of persons in response to abnormal body signals. Such be¬ 
havior influences the manner in which a person monitors his body, defines and 
interprets his symptoms, takes remedial actions, and uses the health care system. 
See also health behavior. 

immunity, acquired Resistance acquired by a host as a result of previous exposure to 
a natural pathogen or foreign substance for the host, e.g., immunity to measles 
resulting from a prior infection with measles virus. 

immunity, active Resistance developed in response to stimulus by an antigen (infect¬ 
ing agent or vaccine) and usually characterized by the presence of antibody pro¬ 
duced by the host. 

immunity, natural Species-determined inherent resistance to a disease agent, e.g., re¬ 
sistance of man to virus of canine distemper. 

immunity, passive Immunity conferred by an antibody produced in another host and 
acquired naturally by an infant from its mother or artificially by administration of 
an antibody-containing preparation (antiserum or immune globulin). 

immunity, specific A state of altered responsiveness to a specific substance acquired
through immunization or natural infection. For certain diseases (e.g., measles,
chicken pox) this protection generally lasts for the life of the individual. 

immunization (Syn: vaccination) Protection of susceptible individuals from communi¬ 
cable disease by administration of a living modified agent (as in yellow fever), a 
suspension of killed organisms (as in whooping cough), or an inactivated toxin (as 
in tetanus). Temporary passive immunization can be produced by administration 
of antibody in the form of immune globulin in some conditions. 

impairment A physical or mental defect at the level of a body system or organ. See
also international classification of impairments, disabilities, and handicaps 
for the official WHO definition. 

inapparent infection (Syn: subclinical infection) The presence of infection in a host
without occurrence of recognizable clinical signs or symptoms. Of epidemiologic 
significance because hosts so infected, though apparently well, may serve as silent 
or inapparent disseminators of the infectious agent. See also disease, preclinical;
disease, subclinical; vector-borne infection. 

inception rate The rate at which new spells of illness occur in a population; a term 
applied principally to short-term spells of illness such as acute respiratory infections,
and preferred by some epidemiologists because an annual incidence rate for
such conditions may exceed the numbers in the population at risk. 

incidence (Syn: incident number) The number of instances of illness commencing, or 
of persons falling ill, during a given period in a specified population.1 More generally,
the number of new events, e.g., new cases of a disease in a defined population,
within a specified period of time. The term incidence is sometimes used to
denote incidence rate. 

1 Prevalence and Incidence. WHO Bull 35:783-784, 1966.

incidence density The person-time incidence rate; sometimes used to describe the 
hazard rate. See force of morbidity. 

incidence-density ratio (idr) The ratio of two incidence densities. See also rate ra¬ 
tio. 

incidence rate The rate at which new events occur in a population. The numerator 
is the number of new events that occur in a defined period; the denominator is the 
population at risk of experiencing the event during this period, sometimes ex¬ 
pressed as person-time. The incidence rate most often used in public health prac¬ 
tice is calculated by the formula 

Number of new events in specified period
----------------------------------------- × 10^n
Number of persons exposed to risk
during this period

In a dynamic population, the denominator is the average size of the population, 
often the estimated population at the mid-period. If the period is a year, this is the
annual incidence rate. This rate is an estimate of the person-time incidence rate,
i.e., the rate per 10^n person-years. If the rate is low, as with many chronic diseases,
it is also a good estimate of the cumulative incidence rate. In follow-up studies with 
no censoring, the incidence rate is calculated by dividing the number of new cases 
in a specified period by the initial size of the cohort of persons being followed; this
is equivalent to the cumulative incidence rate during the period. If the number of 
new cases during a specified period is divided by the sum of the person-time units 
at risk for all persons during the period, the result is the person-time incidence 
rate. 
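
The two calculations described in the incidence rate entry can be sketched as below. The function names, the multiplier of 1000, and all counts are hypothetical values chosen for illustration.

```python
def incidence_rate(new_events, population_at_risk, per=1000):
    """New events divided by persons exposed to risk, scaled by a power of 10."""
    if population_at_risk <= 0:
        raise ValueError("population at risk must be positive")
    return new_events * per / population_at_risk


def person_time_incidence_rate(new_events, person_years, per=1000):
    """New events divided by the total person-time at risk, scaled by a power of 10."""
    if person_years <= 0:
        raise ValueError("person-time must be positive")
    return new_events * per / person_years


# Hypothetical community: 30 new cases in a year, mid-year population 15,000.
annual_rate = incidence_rate(30, 15000)

# Hypothetical cohort: the same 30 cases observed over 14,500 person-years.
person_time_rate = person_time_incidence_rate(30, 14500)

print(annual_rate, person_time_rate)
```

The arithmetic of the two functions is identical; what differs is the denominator, which is a count of persons in the first case and accumulated person-time in the second.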

INCIDENCE STUDY See COHORT STUDY. 

incident number See incidence. 

INCUBATION PERIOD 

1. The time interval between invasion by an infectious agent and appearance of
the first sign or symptom of the disease in question. 

2. In a vector, the period between entry of the infectious agent into the vector 
and the time at which the vector becomes infective; i.e., transmission of the 
infectious agent from the vector to a fresh final host is possible (extrinsic incubation
period).

independence Two events are said to be independent if the occurrence of one is in no 
way predictable from the occurrence of the other. Two variables are said to be
independent if the distribution of values of one is the same for all values of the 
other. Independence is the antonym of association. 

INDEPENDENT VARIABLE 

1. The characteristic being observed or measured that is hypothesized to influ¬ 
ence an event or manifestation (the dependent variable) within the defined 
area of relationships under study; that is, the independent variable is not influenced
by the event or manifestation but may cause it or contribute to its
variation. 

2. In statistics, an independent variable is one of (perhaps) several variables that 
appear as arguments in a regression equation. 

index In epidemiology and related sciences, this word usually means a rating scale, e.g., 
a set of numbers derived from a series of observations of specified variables. Examples
include the many varieties of health status index, scoring systems for severity
or stage of cancer, heart murmurs, mental retardation, etc.

index case The first case in a family or other defined group to come to the attention 
of the investigator. See also propositus. 

index group (Syn: index series)

1. In an experiment, the group receiving the experimental regimen. 

2. In a case control study, the cases. 

3. In a cohort study, the exposed group. 

indicator variable In statistics, a variable taking only one of two possible values, one 
(usually 1) indicating the presence of a condition, and the other (usually zero) indicating
absence of the condition. Used mainly in regression analysis.
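
A minimal sketch of coding an indicator variable follows. The smoking responses are invented for illustration, and the helper name is hypothetical.

```python
def indicator(condition_present):
    """Return 1 when the condition is present and 0 when it is absent."""
    return 1 if condition_present else 0


# Hypothetical survey answers to "Do you currently smoke?"
answers = ["yes", "no", "no", "yes", "no"]
smoker = [indicator(answer == "yes") for answer in answers]

# With 0/1 coding, the mean of the variable is the proportion with the condition.
proportion_smokers = sum(smoker) / len(smoker)
print(smoker, proportion_smokers)
```

Coded this way, the variable can enter a regression equation directly, and its coefficient is read as the difference associated with the presence of the condition.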

indirect adjustment See standardization. 

individual variation Two types are distinguished: 

1. Intraindividual variation: The variation of biological variables within the same
individual, depending upon circumstances such as the phase of certain body 
rhythms and the presence or absence of emotional stress. These variables do 
not have a precise value, but rather a range. Examples include diurnal varia¬ 
tion in body temperature, fluctuation of blood pressure, blood sugar, etc. 

2. Interindividual variation: As used by Darwin, the term means variation between
individuals. This is the preferred usage; the first usage is better described as 
personal variation. 

induction period The period required for a specific cause to produce disease. More 
precisely, the interval from the causal action of a factor to the initiation of the 
disease. For example, a span of many years may pass between (presumably) radiation-induced
mutations and the appearance of leukemia; this span would be the induction
period for radiogenic leukemia. See also incubation period; latent period.



industrial hygiene The science and art devoted to recognition, evaluation, and con¬ 
trol of those environmental factors or stresses arising from or in the workplace, 
which may cause sickness, impaired health, and well-being, or significant discomfort 
and inefficiency among workers or among persons in the community. Alternatively, 
the profession that anticipates and controls unhealthy conditions of work to prevent 
illness among employees. 

infant mortality rate (imr) A measure of the yearly rate of deaths in children less 
than one year old. The denominator is the number of live births in the same year. 
Defined as 


                          Number of deaths in a year of
                          children less than 1 year of age
Infant mortality rate = ---------------------------------------- × 1000
                          Number of live births in the same year


This is often quoted as a useful indicator of the level of health in a community. 
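
The infant mortality rate formula can be worked through in a short sketch. The counts are hypothetical and the function name is an assumption made for the example.

```python
def infant_mortality_rate(infant_deaths, live_births):
    """Deaths under 1 year of age per 1000 live births in the same year."""
    if live_births <= 0:
        raise ValueError("number of live births must be positive")
    return infant_deaths * 1000 / live_births


# Hypothetical community: 84 infant deaths and 12,000 live births in one year.
imr = infant_mortality_rate(84, 12000)
print(imr)
```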

infectibility The host characteristic or state in which the host is capable of being
infected. See also infectiousness; infectivity. 

infection (Syn: colonization) The entry and development or multiplication of an in¬ 
fectious agent in the body of man or animals. Infection is not synonymous with 
infectious disease; the result may be inapparent or manifest. The presence of living 
infectious agents on exterior surfaces of the body is called "infestation" (e.g., pediculosis,
scabies). The presence of living infectious agents upon articles of apparel
or soiled articles is not infection, but represents contamination of such articles. 
See also inapparent infection; transmission of infection. 

infection, gradient of The range of manifestations of illness in the host reflecting 
the response to an infectious agent, which extends from death at one extreme to 
inapparent infection at the other. The frequency of these manifestations varies with
the specific infectious disease. For example, human infection with the virus of rabies
is almost invariably fatal, whereas a high proportion of persons infected in
childhood with the virus of hepatitis A experience a subclinical or mild clinical
infection.

infection, latent period of The time between initiation of infection and first shed¬ 
ding or excretion of the agent. 

INFECTION, SUBCLINICAL See INAPPARENT INFECTION.

INFECTIOUS DISEASE See COMMUNICABLE DISEASE. 

INFECTIOUSNESS A characteristic of the disease that concerns the relative ease with which
it is transmitted to other hosts. A droplet spread disease, for instance, is more in¬ 
fectious than one spread by direct contact. The characteristics of the portals of exit 
and entry are thus also determinants of infectiousness, as are the agent character¬ 
istics of ability to survive away from the host, and of infectivity. 

INFECTIVITY 

1. The characteristic of the disease agent that embodies capability to enter, survive,
and multiply in the host. A measure of infectivity is the secondary attack
rate. 

2. The proportion of exposures, in defined circumstances, that results in infec¬ 
tion. 

inference The process of passing from observations and axioms to generalizations. In 
statistics, the development of generalization from sample data, usually with calcu¬ 
lated degrees of uncertainty. 

infestation The development on (rather than in) the body of a pathogenic agent, e.g.,
body lice. Some authors use the term also to describe invasion of the gut by parasitic
worms.

information system A combination of vital and health statistical data from multiple 
sources, used to derive information about the health needs, health resources, costs, 
use of health services, and outcomes or use by the population of a specified jurisdiction.
The term may also describe the automatic release from computers of stored
information in response to programmed stimuli. For example, parents can be no¬ 
tified when their children are due to receive booster doses of an immunizing agent 
against infectious disease. 

informed consent Voluntary consent given by a subject or by a person responsible for
a subject (e.g., a parent) for participation in an investigation, immunization program,
treatment regimen, etc., after being informed of the purpose, methods, procedures,
benefits, and risks. Awareness of risk is necessary for any subject to make
an informed choice. The term also refers to consent for medical care.

INOCULATION See VACCINATION.

INPUT

1. The sum total of resources and energies purposefully engaged in order to
intervene in the spontaneous operation of a system.

2. The basic resources required in terms of manpower, money, materials, and
time.

INSTANTANEOUS INCIDENCE RATE See FORCE OF MORBIDITY.

instrumental error Error due to faults arising in any or in all aspects of a measuring 
instrument, i.e., calibration, accuracy, precision, etc. Also applied to error arising
from impure reagents, wrong dilutions, etc. 

INTERACTION 

1. The interdependent operation of two or more causes to produce or prevent 
an effect. Biological interaction means the interdependent operation of two or
more causes to produce, prevent, or control disease. See also antagonism,
synergism.

2. Differences in the effects of one or more factors according to the level of the
remaining factor(s). See also effect modifier.

3. In statistics, the necessity for a product term in a linear model. 
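
The statistical sense of interaction can be illustrated with a linear model that contains a product term. The coefficients below are invented for the example and the function name is hypothetical; the sketch only shows how the product term makes the effect of one factor depend on the level of the other.

```python
def predicted(x1, x2, b0=1.0, b1=2.0, b2=0.5, b12=1.5):
    """Linear model with a product (interaction) term b12 * x1 * x2."""
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2


# Effect of factor x1 when factor x2 is absent (x2 = 0).
effect_without_x2 = predicted(1, 0) - predicted(0, 0)

# Effect of factor x1 when factor x2 is present (x2 = 1).
effect_with_x2 = predicted(1, 1) - predicted(0, 1)

print(effect_without_x2, effect_with_x2)
```

If the coefficient of the product term were zero, the two effects would be equal and no product term would be needed in the model.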

intermediate variable (Syn: contingent variable, intervening (causal) variable, mediator
variable) A variable that occurs in a causal pathway from an independent to
a dependent variable. It causes variation in the dependent variable, and itself is
caused to vary by the independent variable. Such a variable is statistically associated
with both the independent and dependent variables. 

INTERNAL VALIDITY See VALIDITY, STUDY.

international classification of disease (icd) The classification of specific
conditions and groups of conditions determined by an internationally representative group
of experts who advise the World Health Organization, which publishes the
complete list in a periodically revised book, the (Manual of the) International Statistical
Classification of Diseases, Injuries and Causes of Death. Every disease entity is assigned
a number. There are 17 major divisions (chapters) and a hierarchical arrangement
of subdivisions (rubrics) within each. Some chapters are "etiologic," e.g., Infective
and Parasitic Conditions; more relate to body systems, e.g., Circulatory System; and
some to classes of condition, e.g., neoplasms, injury (violence). The heterogeneity
of categories reflects prevailing uncertainties about causes of disease (and
classification in relation to causes). The Ninth Revision of the Manual (ICD-9) was
published by WHO in 1977, after ratification in 1976.

INTERNATIONAL CLASSIFICATION OF HEALTH PROBLEMS IN PRIMARY CARE (ICHPPC) A
classification of diseases, conditions, and other reasons for attendance for primary
care. May be used for labeling conditions in problem-oriented records as used by
primary care health workers. This classification is an adaptation of the ICD but
makes more allowance for the diagnostic uncertainty that prevails in primary care.
This classification is now in its second revision (ICHPPC-2). See also
PROBLEM-ORIENTED MEDICAL RECORD.

INTERNATIONAL CLASSIFICATION OF IMPAIRMENTS, DISABILITIES, AND HANDICAPS
(ICIDH) First published by WHO in 1980, this is an attempt to produce a systematic
taxonomy of the consequences of injury and disease.

An impairment is defined in ICIDH as any loss or abnormality of psychological,
physiological, or anatomical structure or function. It is concerned with
abnormalities of body structure and appearance and with organ or system function resulting
from any cause; in principle, impairments represent disturbances at the organ level.

A disability is defined in ICIDH as any restriction or lack (resulting from an
impairment) of ability to perform an activity in a manner or within the range
considered normal for a human being. The term disability reflects the consequences of
impairment in terms of functional performance and activity by the individual;
disabilities thus represent disturbances at the level of the person.

A handicap is defined in ICIDH as a disadvantage for a given individual, resulting
from an impairment or a disability, that limits or prevents the fulfillment of a role
that is normal (depending on age, sex, and social and cultural practice) for that
individual. The term handicap thus reflects interaction with and adaptation to the
individual's surroundings.

international comparison See geographic pathology. See also cross-cultural 

STUDY. 

INTERNAL VALIDITY See VALIDITY, STUDY.

INTERPOLATE, INTERPOLATION To predict the value of variates within the range of
observations; the resulting prediction.

INTERVAL INCIDENCE DENSITY See PERSON-TIME INCIDENCE RATE. 

interval scale See measurement scale. 

INTERVENING CAUSE See INTERMEDIATE VARIABLE. 

INTERVENING VARIABLE

1. Synonym for intermediate variable.

2. A variable whose value is altered in order to block or alter the effect(s) of
another factor.

See also causality, factors in. 

intervention study An epidemiologic investigation designed to test a hypothesized
cause-effect relationship by modifying a supposed causal factor in a population.

interview schedule The precisely designed set of questions used in an interview. See
also survey instrument.

involuntary smoking (Syn: passive smoking) The inhalation by nonsmokers of
tobacco smoke left in the air by smokers; includes both smoke exhaled by smokers
and smoke released directly from burning tobacco into ambient air; the latter is
called sidestream smoke and contains higher proportions of toxic and other
carcinogenic substances than exhaled smoke. The adjective "involuntary" is preferable to
"passive" as the latter implies acquiescence; increasingly, nonsmokers are anything
but acquiescent about this form of air pollution. "Passive" is, however, customary
WHO usage.

island population A group of individuals isolated from larger groups and possessing
a relatively limited gene pool; alternatively, a group that is immunologically isolated
and may therefore be unduly susceptible to infection with alien pathogens.

isolate (noun) Term used in genetics to describe a subpopulation (generally small) in
which matings take place exclusively with other members of the same
subpopulation.

ISOLATION 

1. In microbiology, the separation of an organism from others, usually by mak¬ 
ing serial cultures. 

2. Separation, for the period of communicability, of infected persons or animals 
from others in such places and under such conditions as to prevent or limit 
the direct or indirect transmission of the infectious agent from those infected 
to those who are susceptible or who may spread the agent to others. Control of 
Communicable Diseases in Man 1 lists seven categories of isolation as follows:

a. Strict isolation: This category is designed to prevent transmission of highly
contagious or virulent infections that may be spread by both air and
contact. The specifications, in addition to those above, include a private room
and the use of masks, gowns, and gloves for all persons entering the room.
Special ventilation requirements with the room at negative pressure to
surrounding areas are desirable.

b. Contact isolation: For less highly transmissible or serious infections, for
diseases or conditions that are spread primarily by close or direct contact. In
addition to the basic requirements, a private room is indicated but patients
infected with the same pathogen may share a room. Masks are indicated
for those who come close to the patient, gowns are indicated if soiling is
likely, and gloves are indicated for touching infectious material.

c. Respiratory isolation: To prevent transmission of infectious diseases over short
distances through the air, a private room is indicated but patients infected
with the same organism may share a room. In addition to the basic
requirements, masks are indicated for those who come in close contact with the
patient; gowns and gloves are not indicated.

d. Tuberculosis isolation (AFB isolation): For patients with pulmonary
tuberculosis who have a positive sputum smear or chest x-rays that strongly suggest
active tuberculosis. Specifications include use of a private room with special
ventilation and the door closed. In addition to the basic requirements, masks
are used only if the patient is coughing and does not reliably and
consistently cover the mouth. Gowns are used to prevent gross contamination of
clothing. Gloves are not indicated.

e. Enteric precautions: For infections transmitted by direct or indirect contact
with feces. In addition to the basic requirements, specifications include use
of a private room if patient hygiene is poor. Masks are not indicated; gowns
should be used if soiling is likely and gloves are to be used for touching
contaminated materials.

f. Drainage/secretion precautions: To prevent infections transmitted by direct or
indirect contact with purulent material or drainage from an infected body
site. A private room and masking are not indicated; in addition to the basic
requirements, gowns should be used if soiling is likely and gloves used for
touching contaminated materials.

g. Blood/body fluid precautions: To prevent infections that are transmitted by
direct or indirect contact with infected blood or body fluids. In addition to
the basic requirements, a private room is indicated if patient hygiene is
poor; masks are not indicated; gowns should be used if soiling of clothing
with blood or body fluids is likely. Gloves should be used for touching
blood or body fluids.

See also quarantine. 

1 Benenson AS (Ed): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.

isometric CHART A chart or graph that portrays three dimensions on a plane surface.

jackknife A technique for estimating the variance and the bias of an estimator. If the
sample size is n, the estimator is applied to each subsample of size n - 1, obtained
by dropping a measurement from analysis. The sum of squared differences between
each of the resulting estimates and their mean, multiplied by (n - 1)/n, is the
jackknife estimate of variance; the difference between the mean and the original
estimate, multiplied by (n - 1), is the jackknife estimate of bias.
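
The jackknife recipe above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `jackknife` and the five data values are hypothetical, and the sample mean is used as the estimator because its jackknife results are easy to check by hand.

```python
from statistics import mean

def jackknife(sample, estimator):
    """Return (jackknife variance, jackknife bias) of `estimator` on `sample`."""
    n = len(sample)
    original = estimator(sample)
    # Apply the estimator to each subsample of size n - 1
    # (one measurement dropped at a time).
    leave_one_out = [estimator(sample[:i] + sample[i + 1:]) for i in range(n)]
    loo_mean = mean(leave_one_out)
    # Sum of squared differences from the mean, multiplied by (n - 1)/n.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in leave_one_out)
    # Difference between the mean and the original estimate, times (n - 1).
    bias = (n - 1) * (loo_mean - original)
    return variance, bias

data = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical measurements
var_hat, bias_hat = jackknife(data, mean)
```

For the sample mean the jackknife variance reduces to the familiar s²/n and the bias estimate is zero, which makes this a convenient sanity check before applying the function to less tractable estimators.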

Jenner, Edward (1749-1823) An English physician and naturalist. On the basis of the
observation that dairymaids who had had cowpox never got smallpox, he inoculated
a boy age 10 with cowpox (vaccinia) in 1796. Over the succeeding two years he
inoculated 22 more persons and then attempted to inoculate them with smallpox,
always without inducing this infection. The results of his work were published in
An Inquiry into the Causes and Effects of the Variolae Vaccinae (London, 1798). This
successful method of immunizing persons and populations against smallpox led
directly to the ultimate worldwide eradication of smallpox in 1977.

KAP (knowledge, attitudes, practice) survey A formal survey, using face-to-face 
interviews, in which women are asked standardized pretested questions dealing with 
their knowledge of, attitudes toward, and use of contraceptive methods. Detailed
reproductive histories and attitudes toward desired family size are also elicited. 
Analysis of responses provides much useful information on family planning and 
gives estimates of possible future trends in population structure. The term has 
sometimes been used to describe other varieties of survey of knowledge, attitudes, 
and practice, e.g., health promotion in general or in particular, cigarette smoking. 
kappa A measure of the degree of nonrandom agreement between observers or
measurements of the same categorical variable:

$$\kappa = \frac{P_o - P_e}{1 - P_e}$$

where P_o is the proportion of times the measurements agree, and P_e is the
proportion of times they can be expected to agree by chance alone. If the measurements
agree more often than expected by chance, kappa is positive; if concordance is
complete, kappa = 1; if there is no more nor less than chance concordance, kappa = 0;
if the measurements disagree more than expected by chance, kappa is negative.

Kendall's tau See correlation coefficient.

Koch, Robert (1843-1910) German physician, pathologist, and bacteriologist. One of
the founders of microbiology and an important contributor to our understanding
of infectious disease epidemiology. His major contributions to medical science
include the life cycle of anthrax, the etiology of traumatic infection, methods of fixing
and staining bacteria, and, in 1882, the discovery of the tubercle bacillus. The paper
reporting this contained the first statement of Koch's postulates. In 1883, he
discovered the cholera vibrio. He was awarded the Nobel Prize in 1905.

Koch's postulates First formulated by Henle and adapted by Robert Koch in 1877,
with elaborations in 1882. Koch stated that these postulates should be met before a
causative relationship can be accepted between a particular bacterial parasite or
disease agent and the disease in question.

1. The agent must be shown to be present in every case of the disease by isolation
in pure culture.

2. The agent must not be found in cases of other disease.

3. Once isolated, the agent must be capable of reproducing the disease in
experimental animals.

4. The agent must be recovered from the experimental disease produced.

See also causality; Evans's postulates.

kurtosis The extent to which a unimodal distribution is peaked.

large sample method (Syn: asymptotic method) Any statistical method based on an
approximation to a normal or other distribution that becomes more accurate as
sample size increases. An example is a chi-square test on a set of frequencies.

latent immunization The process of developing immunity by a single or repeated
inapparent asymptomatic infection. Not necessarily related to latent infection. See
also immunity, acquired.

latent infection Persistence of an infectious agent within the host without symptoms 
(and often without demonstrable presence in blood, tissues, or bodily secretions of 
host). 

LATENT PERIOD (Syn: latency) Delay between exposure to a disease-causing agent and
the appearance of manifestations of the disease. After exposure to ionizing
radiation, for instance, there is a latent period of five years, on average, before
development of leukemia, and more than 20 years before development of certain other
malignant conditions. The term "latent period" is often used synonymously with
"induction period," that is, the period between exposure to a disease-causing agent
and the appearance of manifestations of the disease. It has also been defined as the
period from disease initiation to disease detection. See also incubation period,
induction period.

Latin square One of the basic statistical designs for experiments that aim at removing 
from the experimental error the variation from two sources, which may be identi¬ 
fied with the rows and columns of the square. In such a design the allocation of k 
experimental treatments in the cells of a k by k (Latin) square is such that each
treatment occurs exactly once in each row and column. A design for a 5 x 5 square
is as follows: 

    A B C D E
    B A E C D
    C D A E B
    D E B A C
    E C D B A

After Kendall and Buckland. 1

1 Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
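
The defining property of a Latin square (each treatment exactly once in every row and every column) is easy to check mechanically. The sketch below is illustrative only: the helper names `is_latin_square` and `cyclic_latin_square` are assumptions, and the cyclic construction is just one simple way to build a valid square, not the method used for the 5 x 5 design printed in the entry.

```python
def is_latin_square(square):
    """True if every symbol occurs exactly once in each row and each column."""
    k = len(square)
    symbols = set(square[0])
    if len(symbols) != k:
        return False
    rows_ok = all(len(row) == k and set(row) == symbols for row in square)
    # Columns are only inspected once the rows are known to be well formed.
    return rows_ok and all(
        {row[j] for row in square} == symbols for j in range(k)
    )

def cyclic_latin_square(treatments):
    """Build a k by k Latin square by shifting the treatment list one place per row."""
    k = len(treatments)
    return [[treatments[(i + j) % k] for j in range(k)] for i in range(k)]

# The 5 x 5 design given in the Latin square entry.
printed_square = [
    list("ABCDE"),
    list("BAECD"),
    list("CDAEB"),
    list("DEBAC"),
    list("ECDBA"),
]
printed_ok = is_latin_square(printed_square)
cyclic_ok = is_latin_square(cyclic_latin_square(list("ABCDE")))
```

In an actual experiment the rows, columns, and treatment labels of such a square would normally be randomized before use; that step is omitted here.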

Laveran, Alphonse (1845-1922) French army surgeon who discovered the malaria
parasite (1880) while on service in Algeria. Though initially sceptical, the scientific
community soon accepted the validity of Laveran's discovery, which was confirmed
and enlarged by Golgi, Grassi, and others. Laveran was awarded the Nobel Prize
for medicine in 1907.

lead time The time gained in treating or controlling a disease when detection is earlier
than usual, e.g., in the presymptomatic stage, as when screening procedures are
used for detection.

lead time bias (Syn: zero time shift) Overestimation of survival time, due to the
backward shift in the starting point for measuring survival that arises when diseases
such as cancer are detected early, as by screening procedures.

least squares A principle of estimation, due to Gauss, in which the estimates of a set 
of parameters in a statistical model are those quantities that minimize the sum of
squared differences between the observed values of the dependent variable and the 
values predicted by the model. 
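
As a minimal sketch of the least squares principle, the code below fits a straight line y = a + bx using the standard closed-form solution and confirms that nearby parameter values give a larger sum of squared differences. The function names and the four data points are hypothetical, chosen only for illustration.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared differences between observed and predicted values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0, 4.0]   # hypothetical values of the independent variable
ys = [2.1, 3.9, 6.2, 7.8]   # hypothetical observed values
a_hat, b_hat = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, a_hat, b_hat)
worse = sum_squared_residuals(xs, ys, a_hat, b_hat + 0.1)
```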

Ledermann formula Ledermann 1 showed empirically that the frequency distribution
of alcohol consumption in the population of consumers may be log-normal: the
curve is sharply skewed, with approximately one-third of drinkers consuming more than
60% of the total amount of alcohol. Among drinkers the proportion of persons
with alcoholism remains constant at around 7-9%. The pattern of consumption of
illicit drugs among users may also be log-normal. Questions have been raised,
however, about the validity of some assumptions upon which the formula is based.

1 Ledermann S: Alcool, Alcoolisme et Alcoolisation. Paris: Presses Universitaires de France, 1956.

Leeuwenhoek, Antoni van (1632-1723) An early microscopist from Delft, in the
Netherlands, the first to use his microscopes to examine and describe small
creatures (animalcules) such as the protozoan organisms in vaginal secretions,
spermatozoa, and with growing ability to make more powerful microscopes, infectious
microorganisms. He was thus a key figure in the development of the germ theory of
disease.

Levin's attributable risk See attributable fraction (population). 

life events Changes or disruptions in the pattern of living that may be associated with
or produce changes in health. The relationship of "life stress" and "emotional stress"
to onset of several kinds of serious chronic disease such as coronary heart disease
and hypertension has been the subject of epidemiologic studies. The Rahe-Holmes
Social Readjustment Rating Scale 1 was the first to be developed to assign ranks or
ratings to significant life events such as death of a spouse or other close relative,
loss of regular job, relocation, marriage, divorce, etc. Many other rating scales have
since been developed.

1 Holmes TH, Rahe RH: The social readjustment rating scale. J Psychosom Res 11:213-218, 1967.

life expectancy See expectation of life.

LIFE EXPECTANCY FREE FROM DISABILITY (LEFD) An estimate of life expectancy adjusted
for activity-limitation (data for which are derived from hospital discharge statistics,
etc.). See also QALY.

life style The set of habits and customs that is influenced, modified, encouraged, or
constrained by the lifelong process of socialization. These habits and customs
include use of substances such as alcohol, tobacco, tea, coffee; dietary habits, exercise,
etc., which have important implications for health and are often the subject of
epidemiologic investigations.

life table A summarizing technique used to describe the pattern of mortality and
survival in populations. The survival data are time specific and cumulative
probabilities of survival of a group of individuals subject, throughout life, to the
age-specific death rates in question. The life table method can be applied to the study
not only of death, but also of any defined endpoint such as the onset of disease or
the occurrence of specific complication(s) of disease. The survivors to age x are
denoted by the symbol l_x, the expectation of life at age x is denoted by the symbol
e_x, and the proportion alive at age x who die between age x and x + n years is
denoted by the symbol nq_x. The life table method is used extensively in epidemiology
and in many assessments of treatment regimens in clinical practice.

The first rudimentary life tables were published in 1693 by the astronomer
Edmund Halley. These made use of records of the funerals in the city of Breslau. In
1815 in England, the first actuarially correct life table was published, based on both
population and death data classified by age.

Two types of life tables may be distinguished according to the reference year of 
the table: the current or period life table and the generation or cohort life table. 

The current life table is a summary of mortality experience over a brief period
(one to three years), and the population data relate to the middle of that period
(usually close to the date of a census). A current life table therefore represents the
combined mortality experience by age of the population in a particular short period
of time.

The cohort or generation life table describes the actual survival experience of a
group, or cohort, of individuals born at about the same time. Theoretically, the
mortality experience of the persons in the cohort would be observed from their
moment of birth through each consecutive age in successive calendar years until all
of them die.

The clinical life table describes the outcome experience of a group or cohort of 
individuals classified according to their exposure or treatment history. 

Life tables are also classified according to the length of age interval in which the
data are presented. A complete life table contains data for every single year of age
from birth to the last applicable age. An abridged life table contains data by
intervals of five or ten years of age. See also expectation of life; survivorship study.

life table, expectation of life function, e_x (Syn: average future lifetime) The
expectation of life function is a statement of the average number of years of life
remaining to persons who survive to age x.

life table, survivorship function, l_x The survivorship function is a statement of
the number of persons out of an initial population of defined size, e.g., 100,000
live births, who would survive or remain free of a defined endpoint condition to
age x under the age-specific rates for the specified year. The value of l_40, for
example, is determined by the cumulative operation of the specific death rates for all
ages below 40.
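
To make the two life table functions concrete, the sketch below derives the survivorship column l_x from a list of interval death probabilities q_x, and then an expectation of life e_x. Everything specific here is an assumption for illustration: the four-interval q_x values are invented, the radix of 100,000 follows the example in the entry, and e_x is computed on the simplifying assumption that deaths fall, on average, at the middle of each interval.

```python
def survivors(qx, radix=100_000):
    """l_x: number surviving to the start of each interval, from death probabilities q_x."""
    lx = [float(radix)]
    for q in qx:
        # Survivors to the next age are those who did not die in this interval.
        lx.append(lx[-1] * (1.0 - q))
    return lx

def life_expectancy(lx):
    """e_x: average intervals of life remaining at each age with survivors.

    Assumes deaths occur mid-interval, so the time lived in an interval
    is the average of the survivors at its start and at its end.
    """
    lived = [(start + end) / 2 for start, end in zip(lx, lx[1:])]
    return [sum(lived[i:]) / lx[i] for i in range(len(lived)) if lx[i] > 0]

qx = [0.1, 0.2, 0.5, 1.0]   # hypothetical death probabilities; last interval closes the table
lx = survivors(qx)
ex = life_expectancy(lx)
```

With these invented probabilities, l_x falls from 100,000 to 90,000, 72,000, 36,000, and finally 0, showing how each value depends on the cumulative operation of all earlier death probabilities.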

lifetime risk The risk to an individual that a given health effect will occur at any time
after exposure, without regard for the time at which that effect occurs.

likelihood function A function constructed from a statistical model and a set of
observed data, which gives the probability of the observed data for various values of
the unknown model parameters. The parameter values that maximize the
probability are the maximum likelihood estimates of the parameters.
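
A small worked example may help. Assuming a binomial model (an assumption made here for illustration, with hypothetical counts of 7 cases among 20 persons), the likelihood function gives the probability of the observed data for each candidate value of the unknown proportion p, and a simple grid search locates the value that maximizes it.

```python
from math import comb

def binomial_likelihood(p, cases, n):
    """Probability of observing `cases` events among `n` persons if the true proportion is p."""
    return comb(n, cases) * p ** cases * (1 - p) ** (n - cases)

cases, n = 7, 20   # hypothetical observed data

# Evaluate the likelihood over a grid of candidate parameter values.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: binomial_likelihood(p, cases, n))
```

The grid maximum coincides with the observed proportion 7/20, which is the known maximum likelihood estimate for a binomial proportion. In practice the maximum is usually found analytically or with a numerical optimizer rather than a grid.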

likelihood ratio test A statistical test based on the ratio of the maximum value of
the likelihood function under one statistical model to the maximum value under
another statistical model; the models differ in that one includes, the other excludes,
one or more parameters.

Lind, James (1716-1794) British naval surgeon; contributed to improved hygiene aboard
ships. Conducted what amounted to epidemiologic experiments (albeit with small
numbers) which established that scurvy could be prevented by fresh fruits such as
lemons and oranges.

linear model A statistical model in which the value of a parameter for a given value
of a factor, x, is assumed to be equal to a + bx, where a and b are constants.

linear regression Regression analysis of data using linear models.

LINKAGE See GENETIC LINKAGE; RECORD LINKAGE.

LIVE birth WHO definition adopted by Third World Health Assembly, 1950: Live
birth is the complete expulsion or extraction from its mother of a product of
conception, irrespective of the duration of the pregnancy, which, after such separation,
breathes or shows any other evidence of life, such as beating of the heart, pulsation
of the umbilical cord, or definite movement of voluntary muscles, whether or not
the umbilical cord has been cut or the placenta is attached; each product of such a
birth is considered live born.

In the Report of WHO Expert Committee on Prevention of Perinatal Mortality and
Morbidity (Technical Report Series 457, 1970), it is noted that the above definition requires
the inclusion as live births of very early and patently nonviable fetuses and that
accordingly it is not strictly applied. The committee suggested, therefore, that WHO 
should introduce a viability criterion into the definition so that very immature fe¬ 
tuses surviving for very short periods were excluded, even though they showed one 
or more of the transitory signs of life. 

LOCUS 

1. The position of a point, as defined by the coordinates on a graph. 

2. The position that a gene occupies on a chromosome. 

LOD SCORE In genetics, the log odds ratio of observed to expected distribution of
genetic markers.

logistic model A statistical model of an individual's risk (probability of disease y) as a
function of a risk factor x:

$$P(y \mid x) = \frac{1}{1 + e^{-(\alpha + \beta x)}}$$

where e is the (natural) exponential function. This model has a desirable range, 0
to 1, and other attractive statistical features. In the multiple logistic model, the term
βx is replaced by a linear term involving several factors, e.g., β1x1 + β2x2 if there are
two factors x1 and x2.
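
The behavior of the logistic model can be illustrated numerically. The sketch below assumes the usual form, risk = 1 / (1 + e^-(α + βx)); the function names and the coefficient values are hypothetical. It also shows that taking the log odds of the modeled risk recovers the linear term α + βx.

```python
from math import exp, log

def logistic_risk(x, alpha, beta):
    """Modeled probability of disease at risk factor level x; always between 0 and 1."""
    return 1.0 / (1.0 + exp(-(alpha + beta * x)))

def logit(p):
    """Natural log of the odds p / (1 - p)."""
    return log(p / (1.0 - p))

alpha, beta = -2.0, 0.5          # hypothetical coefficients
risk_at_3 = logistic_risk(3.0, alpha, beta)
risk_at_4 = logistic_risk(4.0, alpha, beta)

# The log odds are linear in x, so a one-unit increase in x
# multiplies the odds of disease by exp(beta).
odds_ratio = (risk_at_4 / (1 - risk_at_4)) / (risk_at_3 / (1 - risk_at_3))
```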

logit (Syn: log-odds) The logarithm of the ratio of frequencies of two different
categorical outcomes such as healthy versus sick.

LOGIT MODEL A linear model for the logit (natural log of the odds) of disease as a 
function of a quantitative factor X: 

Logit (disease given X = x) = α + βx

This model is mathematically equivalent to the logistic model. 

log-linear model A statistical model that uses an analysis of variance type of
approach for the modeling of frequency counts in contingency tables.

log-normal distribution If a variable Y is such that log Y is normally distributed, 
it is said to have log-normal distribution. This is a skew distribution. See also 

NORMAL DISTRIBUTION. 

LONGITUDINAL STUDY See COHORT STUDY.

Louis, Pierre-Charles-Alexandre (1787-1872) French physician and
mathematician. One of the founders of medical statistics, his research on tuberculosis, which
included dissection of 358 specimens and study of 1960 clinical cases, led to
publication of Recherches anatomico-pathologiques sur la phthisie (Paris, 1825). This work and
others are marked by rigorous numerical precision and demonstration of
similarities and differences based upon numerical distribution of data. The Lilienfelds 1
have pointed out that Louis greatly influenced the development of statistics as
applied in biology and medicine; he either taught or otherwise directly influenced
many European, British, and American workers, including William Farr, John
Simon, William Augustus Guy, and William Budd in England, George Shattuck, Elisha
Bartlett, and Alonzo Clark in the United States, and Joseph Skoda in Hungary;
those he influenced handed on these important concepts to their own pupils.

1 Lilienfeld AM, Lilienfeld DE: Threads of epidemiologic history, in Foundations of Epidemiology,
2nd ed. (New York: Oxford, 1980), pp. 2S-45.

LOW BIRTH WEIGHT See BIRTH WEIGHT.

"lumping and splitting" Derisive term describing the propensity of epidemiologists
to group related phenomena or to separate phenomena that hitherto have been
grouped. Epidemiologists are sometimes called "lumpers and splitters."

malaria endemicity Certain terms used to describe the occurrence of malaria, based
on enlarged spleen rates, are categorized by WHO as follows:

1. Hypoendemic: Spleen rate in children 2-9 years <10%.

2. Mesoendemic: Spleen rate 11-50%.

3. Hyperendemic: Spleen rate in children over 50%, in adults usually over 25%.

4. Holoendemic: Spleen rate in children constantly over 75%, adult rate low.

malaria periodicity Recurrence at regular intervals of symptoms: periodicity may be 
quotidian, tertian, or quartan, according to the interval between paroxysms: 

1. Quartan: Recurring every third day, i.e., day 1, day 4, day 7, etc.

2. Quotidian: Recurring daily.

3. Tertian: Recurring every alternate day, i.e., day 1, day 3, etc.

malaria patent period Period during which parasites are present in peripheral blood. 

malaria reproduction rate Estimated number of malarial infections potentially dis¬ 
tributed by the average nonimmune infected individual in a community where nei¬ 
ther persons nor mosquitoes were previously infected. 

malaria survey Investigation in selected age-group samples in randomly selected lo¬ 
calities to assess malaria endemicity; uses spleen and/or parasite rates as measure of 
endemicity. 

Malthus, Thomas Robert (1766-1834) An English clergyman and natural scientist
who argued in An Essay on the Principle of Population (London, 1798) that popula¬ 
tions increase in geometric progression while food supplies increase only in arith¬ 
metical progression, thus making famine inevitable. His work justifies his recogni¬ 
tion as one of the founders of demography, even though events proved his predictions 
wrong (at least in the short term). 

Manson, Patrick (1844-1922) Studied tropical diseases in China and made many
contributions of fundamental importance, notably the transmission of filariasis by
culicine mosquitoes and parts of the life cycle of schistosomes. He investigated and
observed many other tropical parasitic diseases and founded the London School of
(Hygiene and) Tropical Medicine in 1898.

Mantel-Haenszel estimate, Mantel-Haenszel odds ratio Mantel and Haenszel 1
provided an adjusted odds ratio as an estimate of relative risk that may be derived
from grouped and matched sets of data. It is now known as the Mantel-Haenszel
estimate, one of the few eponymous terms of modern epidemiology.

The statistic may be regarded as a type of weighted average of the individual
odds ratios, derived from stratifying a sample into a series of strata that are
internally homogeneous with respect to confounding factors.

The Mantel-Haenszel summarization method can also be extended to the
summarization of rate ratios and rate differences from follow-up studies.
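
As an illustration of the weighted-average idea, the sketch below computes the summary odds ratio with the commonly cited Mantel-Haenszel formula, the sum over strata of a·d/n divided by the sum over strata of b·c/n. The formula is not spelled out in the entry above, so it is stated here as an assumption, and the two strata of counts are hypothetical. Each stratum is a 2 x 2 table (a, b, c, d): exposed cases, unexposed cases, exposed controls, unexposed controls.

```python
def mantel_haenszel_odds_ratio(strata):
    """Summary odds ratio across strata of 2 x 2 tables given as (a, b, c, d)."""
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # stratum total
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2 x 2 table."""
    return (a * d) / (b * c)

# Hypothetical data, stratified by a confounder (e.g., two age groups).
strata = [
    (10, 20, 5, 40),
    (15, 10, 10, 20),
]
stratum_ors = [odds_ratio(*table) for table in strata]
or_mh = mantel_haenszel_odds_ratio(strata)
```

The stratum-specific odds ratios here are 4.0 and 3.0, and the summary estimate falls between them, as expected of a weighted average.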

1 Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of
disease. J Natl Cancer Inst 22:719-748, 1959.

Mantel-Haenszel test A summary chi-square test developed by Mantel and
Haenszel for stratified data and used when controlling for confounding.

margin of safety An estimate of the ratio of the no-observed-effect level (NOEL) to
the level accepted in regulations.

marginals The row and column totals of a contingency table.

Markov process A stochastic process such that the conditional probability distribution 
for the state at any future instant, given the present state, is unaffected by any 
additional knowledge of the past history of the system. 

MASKED STUDY See BLINDED STUDY.

masking Procedure(s) intended to keep participant(s) in a study from knowing some
fact(s) or observation(s) that might bias or influence their actions or decisions
regarding the study.

MATCHED CONTROLS See CONTROLS, MATCHED.

MATCHING The process of making a study group and a comparison group comparable
with respect to extraneous factors. Several kinds of matching can be distinguished: 

Caliper matching is the process of matching comparison group subjects to study 
group subjects within a specified distance for a continuous variable (e.g., matching 
age to within two years). 

Frequency matching requires that the frequency distributions of the matched 
variable(s) be similar in study and comparison groups.

Category matching is the process of matching study and control group subjects 
in broad classes such as relatively wide age ranges or occupational groups. 

Individual matching relies on identifying individual subjects for comparison, each 
resembling a study subject on the matched variable(s).

Pair matching is individual matching in which study and comparison subjects are 
paired. 

MATERNAL mortality (rate) The risk of dying from causes associated with childbirth
is measured by the maternal mortality rate. For this purpose the deaths used in the
numerator are those arising during pregnancy or from puerperal causes, i.e., deaths
occurring during and/or due to deliveries, complications of pregnancy, childbirth,
and the puerperium. Women exposed to the risk of dying from puerperal causes
are those who have been pregnant during the period. Their number being
unknown, the number of live births is used as the conventional denominator for
computing comparable maternal mortality rates. The formula is

$$\text{Annual maternal mortality rate} = \frac{\text{Number of deaths from puerperal causes in a given geographic area during a given year}}{\text{Number of live births that occurred among the population of the given geographic area during the same year}} \times 1{,}000 \ (\text{or } 100{,}000)$$
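
The calculation can be sketched as follows. The function name and the counts (12 deaths, 80,000 live births) are hypothetical and serve only to show the arithmetic; the multiplier is left as a parameter because both 1,000 and 100,000 are used in practice.

```python
def maternal_mortality_rate(puerperal_deaths, live_births, per=100_000):
    """Deaths from puerperal causes per `per` live births in the same area and year."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    return puerperal_deaths / live_births * per

# Hypothetical area and year: 12 maternal deaths, 80,000 live births.
rate_per_100k = maternal_mortality_rate(12, 80_000)
rate_per_1k = maternal_mortality_rate(12, 80_000, per=1_000)
```

With these invented figures the rate is 15 per 100,000 live births, or 0.15 per 1,000.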

There is variation in the duration of the postpartum period in which death may 
occur and be certified due to "puerperal causes," i.e., "maternal mortality."
ing to WHO, a maternal death is defined as the death of a woman while pregnant 
or within 42 days of termination of pregnancy, irrespective of the duration and the 
site of pregnancy, from any cause related to or aggravated by the pregnancy or its 
management but not from accidental or incidental causes. 

Maternal deaths should be subdivided into two groups: (1) direct obstetric deaths,
resulting from obstetric complications of the pregnant state, and (2) indirect
obstetric deaths, resulting from preexisting disease or conditions not due to direct
obstetric causes.

Although WHO defines maternal mortality as death during pregnancy or within 
42 days of delivery, in some jurisdictions, a period as long as a year is used. 

mathematical model A representation of a system, process, or relationship in math¬ 
ematical form in which equations are used to simulate the behavior of the system 
or process under study. The model usually consists of two parts: the mathematical 
structure itself, e.g., Newton’s inverse square law or Gauss’s "normal" law, and the 
particular constants or parameters associated with them, such as Newton’s gravita¬ 
tional constant or the Gaussian standard deviation. 

A mathematical model is deterministic if the relations between the variables in¬ 
volved take on values not allowing for any play or chance. A model is said to be 
statistical, stochastic, or random, if random variation is allowed to enter the picture. 
See also model. 

MAXIMUM ALLOWABLE CONCENTRATION (MAC) See SAFETY STANDARDS. 

maximum likelihood estimate The value for an unknown parameter that maximizes
the probability of obtaining exactly the data that were observed.
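As a rough illustration of the idea, the sketch below searches a grid of candidate proportions for the one that gives the observed data the highest binomial probability. The binomial setting, the grid search, and the function names are assumptions chosen for simplicity. They are not the only way to obtain such an estimate.

```python
import math


def binomial_log_likelihood(p, successes, n):
    """Log-probability of `successes` out of `n` when the true proportion is p."""
    return (math.log(math.comb(n, successes))
            + successes * math.log(p)
            + (n - successes) * math.log(1 - p))


def grid_mle(successes, n, steps=999):
    """Return the grid value of p in (0, 1) with the highest likelihood."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda p: binomial_log_likelihood(p, successes, n))


# Hypothetical data: 7 cases observed among 20 subjects.
p_hat = grid_mle(successes=7, n=20)
print(p_hat)
```

For binomial data the search lands on the observed proportion (here 7/20), which is the familiar closed-form result.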

McNemar’s test A form of the chi-square test for matched-pair data. It is a special
case of the Mantel-Haenszel test.
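A minimal sketch of the statistic follows. It uses the usual form based on the two kinds of discordant pairs, without a continuity correction. The variable names and the example counts are illustrative assumptions.

```python
def mcnemar_chi_square(b, c):
    """McNemar chi-square (1 degree of freedom) from discordant-pair counts.

    b: pairs in which only the first member has the characteristic
    c: pairs in which only the second member has the characteristic
    """
    if b + c == 0:
        raise ValueError("no discordant pairs to compare")
    return (b - c) ** 2 / (b + c)


# Hypothetical matched-pair study: 15 pairs discordant one way, 5 the other.
print(mcnemar_chi_square(15, 5))
```

Concordant pairs do not enter the statistic. Only pairs whose members differ carry information about the association.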

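mean, arithmetic A measure of central tendency, computed by summing all the
individual values and dividing by the number of values.

mean, geometric A measure of central tendency. Calculable only for positive values.
It is calculated by taking the logarithms of the values, calculating their arithmetic
mean, then converting back by taking the antilogarithm.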
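The logarithm-and-antilogarithm procedure just described can be sketched as follows. The function name and the sample values are illustrative assumptions.

```python
import math


def geometric_mean(values):
    """Mean computed on the log scale and converted back with the antilogarithm."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    mean_of_logs = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_of_logs)


# Hypothetical titres spanning two orders of magnitude.
print(geometric_mean([1, 10, 100]))
```

Because the averaging happens on the log scale, a single very large value pulls this mean up far less than it would pull an arithmetic mean.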

mean, harmonic A measure of central tendency computed by summing the reciprocals
of all the individual values and dividing the resulting sum into the number
of values.
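A short sketch of that computation is given here. The function name and example values are assumptions made for illustration.

```python
def harmonic_mean(values):
    """Number of values divided by the sum of their reciprocals."""
    if not values or any(v == 0 for v in values):
        raise ValueError("harmonic mean needs at least one value, none zero")
    reciprocal_sum = sum(1 / v for v in values)
    return len(values) / reciprocal_sum


# Hypothetical values: 3 / (1/1 + 1/2 + 1/4) = 3 / 1.75
print(harmonic_mean([1, 2, 4]))
```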

MEASURE OF association A quantity that expresses the strength of association between 
variables. Commonly used measures of association are differences between means, 
proportions or rates, the rate ratio, the odds ratio, and correlation and regression 
coefficients. 

measurement The procedure of applying a standard scale to a variable or to a set of
values.

measurement, problems with terminology There is sometimes uncertainty about
the terms used to describe the properties of measurement: accuracy, precision,
validity, reliability, repeatability, and reproducibility. Accuracy and precision are often
used synonymously, validity is defined variously, and reliability, repeatability, and
reproducibility are frequently used interchangeably.

Etymologies are helpful in making a case for preferred usages, but they are not
always decisive. Accuracy is from the Latin cura, care, and while this may be of
interest to those in the health field, it does not illuminate the origins of the standard
definition, that is, "conforming to a standard or a true value" (OED). Accuracy is
distinguished from precision in this way: A measurement or statement can reflect
or represent a true value without detail. A temperature reading of 98.6°F is accurate,
but it is not precise if a more refined thermometer registers a temperature of
98.637°F.

Precision (from Latin praecidere, cut short) is the quality of being sharply defined
through exact detail. A faulty measurement may be expressed precisely, but may
not be accurate. Measurements should be both accurate and precise, but the two
terms are not synonymous. Consistency or reliability describes the property of
measurements or results that conform to themselves.

Reliability (Latin religare, to bind) is defined by the OED as a quality that is sound
and dependable. Its epidemiologic usage is similar; a result or measurement is said
to be reliable when it is stable, i.e., when repetition of an experiment or measurement
gives the same results. The terms "repeatability" and "reproducibility" are
synonymous (the OED defines each in terms of the other), but they do not refer to
a quality of measurement, rather only to the action of performing something more
than once. Thus, a way of discovering whether or not a measurement is reliable is
to repeat or reproduce it. The terms "repeatability" and "reproducibility," formed
from their respective verbs, are used inaccurately when they are substituted for
"reliability," a noun that refers to the measuring procedure rather than the attribute
being measured. However, in common usage, both repeatability and reproducibility
refer to the capacity of a measuring procedure to produce the same
result on each occasion in a series of procedures conducted under identical
conditions.

Validity is used correctly when it agrees with the standard definition given by the
OED: "sound and sufficient." If, in the epidemiologic sense, a test measures what it
purports to measure (it is sufficient) then the test is said to be valid. See also
accuracy; precision; reliability; repeatability; validity.

measurement scale The complete range of possible values for a measurement (e.g.,
the set of possible responses to a question, the physically possible range for a set of
body weights). Measurement scales are sometimes classified into five major types,
according to the quantitative character of the scale:

1. Dichotomous scale: One that arranges items into either of two mutually exclusive 
categories. 

2. Nominal scale: Classification into unordered qualitative categories; e.g., race,
religion, and country of birth as measurements of individual attributes are 
purely nominal scales, as there is no inherent order to their categories. 

3. Ordinal scale: Classification into ordered qualitative categories, e.g., social class
(I, II, III, etc.), where the values have a distinct order, but their categories are
qualitative in that there is no natural (numerical) distance between their possible
values.

4. Interval scale: An (equal) interval scale involves assignment of values with a natural
distance between them, so that a particular distance (interval) between two 
values in one region of the scale meaningfully represents the same distance 
between two values in another region of the scale. Examples include Celsius 
and Fahrenheit temperature, date of birth. 

5. Ratio scale: A ratio scale is an interval scale with a true zero point, so that ratios
between values are meaningfully defined. Examples are absolute temperature,
weight, height, blood count, and income, as in each case it is meaningful to 
speak of one value as being so many times greater or less than another value. 
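The difference between interval and ratio scales can be made concrete with temperature. The sketch below is an illustration with arbitrarily chosen readings. It shows that differences agree on the Celsius (interval) and Kelvin (ratio) scales, while ratios computed on the Celsius scale are not meaningful.

```python
def celsius_to_kelvin(celsius):
    """Shift a Celsius reading to the absolute (Kelvin) scale."""
    return celsius + 273.15


# Two hypothetical readings on the Celsius scale.
c_low, c_high = 10.0, 20.0

# Intervals are the same on either scale.
interval_c = c_high - c_low
interval_k = celsius_to_kelvin(c_high) - celsius_to_kelvin(c_low)

# Ratios are not: 20 C is not "twice as hot" as 10 C.
naive_ratio = c_high / c_low
kelvin_ratio = celsius_to_kelvin(c_high) / celsius_to_kelvin(c_low)

print(interval_c, round(interval_k, 6))
print(naive_ratio, round(kelvin_ratio, 4))
```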

measures of central tendency A general term for several characteristics of the
distribution of a set of values or measurements around a value or values at or near
the middle of the set. The principal measures of central tendency are the mean
(average), median, and mode. (See entries under each.)
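For a small set of values the three principal measures can be computed with Python's standard `statistics` module. The data are made up for the illustration.

```python
import statistics

# Hypothetical set of measurements.
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # sum of the values divided by their number
median = statistics.median(values)  # midpoint of the ordered values
mode = statistics.mode(values)      # most frequently occurring value

print(mean, median, mode)
```

With an even number of values the median is taken as the average of the two middle values, which is the convention the `statistics` module follows.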

MECHANICAL TRANSMISSION See VECTOR-BORNE INFECTION. 

median A measure of central tendency. The simplest division of a set of measurements
is into two parts: the lower and the upper half. The point on the scale that
divides the group in this way is called the "median."

mediator (mediating) variable See intermediate variable.

medical audit A health service evaluation procedure in which selected data from
patients' charts are summarized in tables displaying such data as average length of
stay or duration of an episode of care, the frequency of diagnostic and therapeutic
procedures, and outcomes of care arranged by diagnostic category. These are often
compared with predetermined norms.

medical care See health care. 

medical record A file of information relating to transaction(s) in personal health care. 
In addition to facts about a patient's illness, medical records nearly always contain
other information. The full range of data in medical records includes the following: 

1. Clinical, i.e., diagnosis, treatment, progress, etc. 

2. Demographic, i.e., age, sex, birthplace, residence, etc. 

3. Sociocultural, i.e., language, ethnic origin, religion, etc. 

4. Sociological, i.e., family (next of kin), occupation, etc. 

5. Economic, i.e., method of payment (fee-for-service, indigent, etc.).

6. Administrative, i.e., site of care, provider, etc. 

7. "Behavioral," e.g., a record of broken appointments may indicate dissatisfaction
with service provided. 

medical statistics See biostatistics. 

Mendel's laws Derived from the pioneering genetic studies of Gregor Mendel (1822-1884).
Mendel's first law states that genes are particulate units that segregate; i.e.,
members of the same pair of genes are never present in the same gamete, but 
always separate and pass to different gametes. Mendel’s second law states that genes 
assort independently; i.e., members of different pairs of genes move to gametes 
independently of one another. 

meta-analysis The process of using statistical methods to combine the results of
different studies. In the biomedical sciences, the systematic, organized and structured
evaluation of a problem of interest, using information (commonly in the form of
statistical tables or other data) from a number of independent studies of the problem.
A frequent application has been the pooling of results from a number of small
randomized controlled trials, none in itself large enough to demonstrate statistically
significant differences, but in aggregate, capable of so doing. Meta-analysis has a
qualitative component, i.e., application of predetermined criteria of quality (e.g.,
completeness of data, absence of biases), and a quantitative component, i.e.,
integration of the numerical information. Meta-analysis includes aspects of an overview,
and of pooling of data, but implies more than either of these processes.
Meta-analysis carries the risk of several biases.
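The quantitative component can be illustrated with one common pooling technique, fixed-effect inverse-variance weighting. The entry does not name a specific method, so this choice is an assumption made for the sketch. The three "trials," their log risk ratios, and their standard errors are all hypothetical.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Inverse-variance weighted average of study estimates and its standard error."""
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return pooled, pooled_se


# Hypothetical log risk ratios from three small trials (not real data).
estimates = [-0.30, -0.20, -0.40]
standard_errors = [0.20, 0.25, 0.30]

pooled, pooled_se = pool_fixed_effect(estimates, standard_errors)
for e, se in zip(estimates, standard_errors):
    print(f"single trial z = {e / se:.2f}")
print(f"pooled estimate = {pooled:.3f}, z = {pooled / pooled_se:.2f}")
```

In this made-up example no single trial reaches the conventional threshold of |z| > 1.96, but the pooled estimate does, which is the situation the entry describes. The sketch covers only the arithmetic. Screening studies for quality and checking for bias are separate steps.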

METHODOLOGY The scientific study of methods. Methodology should not be confused
with methods. Sad to say, the word "methodology" is all too often used when the
writer means "method."

MIASMA THEORY An explanation for the origin of epidemics, the "miasma theory" was
implied by many ancient writers, and made explicit by Lancisi in De noxiis paludum
effluviis (1717). It was based on the notion that when the air was of a "bad quality"
(a state that was not precisely defined, but that was supposedly due to decaying
organic matter), the persons breathing that air would become ill. Malaria ("bad air")
is the classic example of a disease that was long attributed to miasmata. "Miasma"
was believed to pass from cases to susceptibles in those diseases considered
contagious.

migrant studies Studies taking advantage of migration to one country by those from 
other countries with different physical and biological environments, cultural back¬ 
ground and/or genetic makeup, and different morbidity or mortality experience. 
Comparisons are made between the mortality or morbidity experience of the
migrant groups with that of their current country of residence and/or their country
of origin. Sometimes the experiences of a number of different groups who have 
migrated to the same country have been compared. 

Mill's canons In A System of Logic (1856), J. S. Mill devised logical strategies (canons)
from which causal relationships may be inferred. Four in particular are pertinent
to epidemiology: the methods of agreement, difference, residues, and concomitant
variation.

Method of agreement (first canon): "If two or more instances of the phenomenon
under investigation have only one circumstance in common, the circumstance in
which alone all the instances agree, is the cause (or effect) of the given
phenomenon."

Method of difference (second canon): "If an instance in which the phenomenon
under investigation occurs, and an instance in which it does not occur, have every
circumstance in common save one, that one occurring only in the former, the
circumstance in which alone the two instances differ is the effect, or cause, or a
necessary part of the cause, of the phenomenon."

Method of residues (fourth canon): "Subduct from any phenomenon such part as
is known by previous inductions to be the effect of certain antecedents, and the
residue of the phenomenon is the effect of the remaining antecedents."

Method of concomitant variation (fifth canon): "Whatever phenomenon varies in any
manner whenever another phenomenon varies in some particular manner, is either
a cause or an effect of that phenomenon, or is connected with it through some fact
of causation."

minimum data set (Syn: uniform basic data set) A widely agreed upon and generally 
accepted set of terms and definitions constituting a core of data acquired for med¬ 
ical records and employed for developing statistics suitable for diverse types of analyses 
and users. Such sets have been developed for birth and death certificates, ambulatory
care, hospital care, and long-term care. See also birth certificate; death
certificate; hospital discharge abstract system.

misclassification The erroneous classification of an individual, a value, or an
attribute into a category other than that to which it should be assigned. The
probability of misclassification may be the same in all study groups (nondifferential
misclassification) or may vary between groups (differential misclassification).

mobility, geographic Movement of persons from one country or region to another.

mobility, social Movement from one defined socioeconomic group to another, either
upward or downward. Downward social mobility, which can be related to impaired
health (e.g., alcoholism, schizophrenia, or mental retardation), is sometimes
referred to as "social drift."

mode One of the measures of central tendency. The most frequently occurring value
in a set of observations.

MODEL 

1. An abstract representation of the relationship between logical, analytical, or
empirical components of a system. See also mathematical model.

2. A formalized expression of a theory or the causal situation that is regarded as
having generated observed data.

3. (Animal) model: an experimental system that uses animals, because humans
cannot be used for ethical or other reasons.

4. A small-scale simulation, e.g., by using an "average region" with characteristics
resembling those of the whole country.

In epidemiology the use of models began with an effort to predict the onset and
course of epidemics. In the second report of the Registrar-General of England and
Wales (1840), William Farr developed the beginnings of a predictive model for
communicable disease epidemics. He had recognized regularities in the smallpox
epidemics of the 1830s. By calculating frequency curves for these past outbreaks,
he estimated the deaths to be expected. See also demonstration model; mathematical
model; theoretical epidemiology.

moderator variable (Syn: qualifier variable) In a study of a possible causal factor 
and an outcome, a moderator variable is a third variable exhibiting statistical inter¬ 
action by virtue of its being antecedent or intermediate in the causal process under 
study. If it is antecedent, it is termed a conditional moderator variable or effect 
modifier; if it is intermediate, it is a contingent moderator variable. See also inter¬ 
action; intermediate variable. 

MONITORING 

1. The performance and analysis of routine measurements, aimed at detecting
changes in the environment or health status of populations. Not to be confused
with surveillance. To some, monitoring also implies intervention in the
light of observed measurements.

2. Ongoing measurement of performance of a health service or a health profes¬ 
sional, or of the extent to which patients comply with or adhere to advice from 
health professionals. 

3. In management, the continuous oversight of the implementation of an activity 
that seeks to ensure that input deliveries, work schedules, targeted outputs, 
and other required actions are proceeding according to plan. 

monotonic sequence A sequence is said to be monotonic increasing if each value is 
greater than or equal to the previous one, and monotonic decreasing if each value 
is less than or equal to the previous one. If equality of values is excluded, we speak 
of a strictly (increasing or decreasing) monotonic sequence. 
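These definitions translate directly into a check on neighboring values. The function names in this sketch are illustrative assumptions.

```python
def is_monotonic_increasing(seq, strict=False):
    """True if each value is >= the previous one (> when strict)."""
    pairs = list(zip(seq, seq[1:]))
    if strict:
        return all(a < b for a, b in pairs)
    return all(a <= b for a, b in pairs)


def is_monotonic_decreasing(seq, strict=False):
    """True if each value is <= the previous one (< when strict)."""
    pairs = list(zip(seq, seq[1:]))
    if strict:
        return all(a > b for a, b in pairs)
    return all(a >= b for a, b in pairs)


print(is_monotonic_increasing([1, 2, 2, 3]))               # ties allowed
print(is_monotonic_increasing([1, 2, 2, 3], strict=True))  # ties excluded
print(is_monotonic_decreasing([5, 3, 1], strict=True))
```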

Monte Carlo study, trial Complex relationships that are difficult to solve by math¬ 
ematical analysis are sometimes studied by computer experiments that simulate and 
analyze a sequence of events, using random numbers. Such experiments are called 
Monte Carlo trials, or studies, in recognition of Monte Carlo as one of the gambling 
capitals of the world. 
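A toy example in keeping with the gambling origin of the name: the sketch below estimates the chance of throwing at least one six in four rolls of a die by repeated random simulation, then compares the estimate with the value obtained analytically. The problem, the trial count, and the seed are arbitrary choices for illustration.

```python
import random


def estimate_probability(trials, seed=0):
    """Estimate the chance of at least one six in four rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        if any(rng.randint(1, 6) == 6 for _ in range(4)):
            hits += 1
    return hits / trials


exact = 1 - (5 / 6) ** 4  # analytic answer, about 0.518
estimate = estimate_probability(50_000)
print(f"simulated {estimate:.3f} vs exact {exact:.3f}")
```

Here the exact answer is known, which makes it possible to see how close the simulation comes. In practice the method is used when no such closed form is available.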

morbidity Any departure, subjective or objective, from a state of physiological or psy¬ 
chological well-being. In this sense, sickness, illness, and morbid condition are similarly 
defined and synonymous (but see disease). 

The WHO Expert Committee on Health Statistics noted in its Sixth report (1959) 
that morbidity could be measured in terms of three units: (1) persons who were ill; 
(2) the illnesses (periods or spells of illness) that these persons experienced; and (3) 
the duration (days, weeks, etc.) of these illnesses. See also health index; incidence 
rate; notifiable disease; prevalence rate. 

morbidity rate A term, preferably avoided, used indiscriminately to refer to incidence 
or prevalence rates of disease. 

morbidity survey A method for estimating the prevalence and/or incidence of disease
or diseases in a population. A morbidity survey is usually designed simply to
ascertain the facts as to disease distribution, and not to test a hypothesis. See also
cross-sectional study; health survey.

mortality rate See death rate. 

mortality STATISTICS Statistical tables compiled from the information contained in death
certificates. Most administrative jurisdictions in all nations produce tables of
mortality statistics. These may be published at regular intervals; they usually show
numbers of deaths and/or rates by age, sex, cause, and sometimes other variables.

multicollinearity In multiple regression analysis, a situation in which at least some
of the independent variables are highly correlated with each other. Such a situation
can result in inaccurate estimates of the parameters in the regression model.
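One simple way to look for the problem is to compute the correlation between pairs of independent variables. The sketch below does this for two made-up predictors and also reports the variance inflation factor, 1 / (1 - r^2), a commonly used diagnostic that the entry itself does not mention. Data and names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical predictors that measure nearly the same thing.
height_cm = [150, 160, 170, 180, 190]
arm_span_cm = [152, 159, 171, 182, 189]

r = pearson_r(height_cm, arm_span_cm)
vif = 1 / (1 - r ** 2)  # variance inflation factor for the two-predictor case
print(f"r = {r:.3f}, VIF = {vif:.0f}")
```

A correlation this close to 1 means the regression cannot separate the two effects reliably, so the individual coefficient estimates become unstable.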

MULTIFACTORIAL ETIOLOGY See MULTIPLE CAUSATION.

multinomial DISTRIBUTION The probability distribution associated with the classification
of each of a sample of individuals into one of several mutually exclusive and
exhaustive categories. When the number of categories is two, the distribution is
called binomial. See also binomial distribution.
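The probability of a particular set of category counts can be computed as sketched here. The function name and the example probabilities are assumptions. The last lines check that the two-category case matches the binomial probability.

```python
import math


def multinomial_pmf(counts, probs):
    """Probability of observing `counts` when category probabilities are `probs`."""
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    coefficient = math.factorial(sum(counts))
    for k in counts:
        coefficient //= math.factorial(k)  # stays an exact integer at each step
    probability = 1.0
    for k, p in zip(counts, probs):
        probability *= p ** k
    return coefficient * probability


# Three hypothetical categories, one individual observed in each.
print(multinomial_pmf([1, 1, 1], [0.2, 0.3, 0.5]))

# With two categories the result equals the binomial probability.
print(multinomial_pmf([2, 1], [0.5, 0.5]), math.comb(3, 2) * 0.5 ** 3)
```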

MULTIPHASIC SCREENING See SCREENING.

multiple causation (Syn: multifactorial etiology) This term is used to refer to the 
concept that a given disease or other outcome may have more than one cause. A 
combination of causes or alternative combinations of causes may be required to 
produce the effect. 

MULTIPLE LOGISTIC MODEL See LOGISTIC MODEL. 

multiple risk Where more than one risk factor for the development of a disease or
other outcome is present, and their combined presence results in an increased risk,
we speak of "multiple risk." The increased risk may be due to the additive effects
of the risks associated with the separate risk factors, or to synergism.

multiplicative model A model in which the joint effect of two or more causes is the
product of their effects. For instance, if factor a multiplies risk by the amount a in
the absence of factor b, and factor b multiplies risk by the amount b in the absence
of factor a, the combined effect of factors a and b on risk is a × b. See also additive
model.
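The contrast with an additive model can be shown with two hypothetical relative risks. The additive form used here, in which the excess risks (relative risk minus 1) are summed, is the usual textbook formulation and is an assumption of this sketch rather than something spelled out in the entry.

```python
def joint_rr_multiplicative(rr_a, rr_b):
    """Joint relative risk when the separate effects multiply."""
    return rr_a * rr_b


def joint_rr_additive(rr_a, rr_b):
    """Joint relative risk when the separate excess risks add."""
    return 1 + (rr_a - 1) + (rr_b - 1)


# Hypothetical relative risks for factor a alone and factor b alone.
rr_a, rr_b = 3.0, 5.0
print(joint_rr_multiplicative(rr_a, rr_b))  # product of the two effects
print(joint_rr_additive(rr_a, rr_b))        # sum of the two excess risks
```

An observed joint relative risk well above the additive prediction points toward the kind of synergism mentioned under multiple risk.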

multistage model A mathematical model, mainly for carcinogenesis, based on the 
theory that a specific carcinogen may affect one of a number of stages in the de¬ 
velopment of cancer. 

multivariate analysis A set of techniques used when the variation in several variables
has to be studied simultaneously. In statistics, any analytic method that allows
the simultaneous study of two or more dependent variables.

mutation Heritable change in the genetic material not caused by genetic segregation 
or recombination, which is transmitted to daughter cells and to succeeding gener¬ 
ations, provided it is not a dominant lethal factor. 

mutation rate The frequency with which mutations occur per gene or per generation.

national death index A computerized central registry of deaths in the United States,
started in 1979 and operated by the U.S. National Center for Health Statistics, that
facilitates mortality followup; cf. Canadian mortality data base.

natural experiment A term probably derived from John Snow's account of his investigation
of the practices of water supply companies in relation to the cholera epidemics
in London in the 1850s. It refers to naturally occurring circumstances in
which populations have different exposures to a supposed causal factor in a situation
resembling an actual experiment in which persons would be assigned to groups.

John Snow was able to trace the London outbreaks of cholera in the 19th century 
to water impurity as a result of comparisons made between two water companies. 
It would have been unethical to expose "test subjects" to infection, but the situation
at the time afforded him the opportunity to make observations of crucial impor¬ 
tance. 

To turn this grand experiment to account, all that was required was to learn the 
supply of water to each individual house where a fatal attack of cholera might occur 
... I resolved to spare no exertion which might be necessary to ascertain the exact 
effect of the water supply on the progress of the epidemic, in the places where all 
the circumstances were so happily adapted for the inquiry ... I had no reason to
doubt the correctness of the conclusions I had drawn from the great number of 
facts already in my possession, but I felt that the circumstances of the cholera-poisoning 
passing down the sewers into a great river, and being distributed through miles of 
pipes, and yet producing its specific effects was a fact of so startling a nature, and 
of so vast importance to the community, that it could not be too rigidly examined 
or established on too firm a basis. (Snow, On the Mode of Communication of Cholera,
1855)

natural history of disease The course of a disease from onset (inception) to resolution.
Many diseases have certain well-defined stages that, taken all together, are
referred to as the "natural history of the disease" in question. These stages are as
follows:

1. Stage of pathological onset 

2. Presymptomatic stage: from onset to the first appearance of symptoms and/or
signs; screening tests may lead to earlier detection.

3. Clinically manifest disease, which may progress inexorably to a fatal termination,
be subject to remissions and relapses, or regress spontaneously, leading
to recovery.

Detection and intervention can alter the natural history of disease. The term has
also been used to mean "descriptive epidemiology of disease."

natural history study A study, generally longitudinal, designed to yield information
about the natural course of a disease or condition.

natural rate of increase (decrease) See growth rate of population.

nearest neighbor method A means of analyzing the spatial patterns of a free-living
population. A term from veterinary epidemiology. Random sampling points are
located throughout an area and the distance from each point to the nearest individual
is measured; alternatively, individuals are selected at random and from each of
these the distance to the nearest neighbor is measured.
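Both variants of the measurement can be sketched with plane coordinates. The coordinates, the function names, and the use of straight-line (Euclidean) distance are assumptions made for the illustration.

```python
import math


def distance_to_nearest_individual(point, individuals):
    """Distance from a sampling point to the closest individual."""
    return min(math.dist(point, other) for other in individuals)


def nearest_neighbor_distances(individuals):
    """For each individual, the distance to its closest neighbor."""
    return [
        min(math.dist(p, q) for j, q in enumerate(individuals) if j != i)
        for i, p in enumerate(individuals)
    ]


# Hypothetical locations of three animals on a map grid.
animals = [(0, 0), (3, 4), (10, 0)]

print(distance_to_nearest_individual((0, 1), animals))  # point-to-individual variant
print(nearest_neighbor_distances(animals))              # individual-to-neighbor variant
```

Only the distance measurement is shown. Interpreting the distances as evidence of clustering or regular spacing requires a comparison with what a random pattern would give.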

necessary and SUFFICIENT cause A causal factor whose presence is required for the
occurrence of the effect and whose presence is always followed by the effect. See
also association; causality.

needs (Syn: health needs, perceived needs, professionally defined needs, unmet needs) 
This term has both a precise and an all-but-indefinable meaning in the context of 
public health. We speak of needs in precise numerical terms when we refer to 
specific indicators of disease or premature death that require intervention because 
their level is above that generally accepted in the society or community in question. 
For example, an infant mortality rate two or three times greater than the national 
average in a particular community is an indicator of unmet health needs of infants
in that community (not to be confused with a need for more or better medical care).
It should be clear that even in this seemingly precise usage there are implied value
judgments. It must be explicitly stated that "needs" always reflect prevailing value
judgments as well as the existing ability to control a particular public health problem.
Thus, sputum-positive pulmonary tuberculosis was not recognized as a health
need in 1850 but was by 1900 in the industrialized nations; the ill effects of cigarette
smoking must now be universally acknowledged as a health need; and child
abuse is increasingly regarded as a public health problem, to which we could apply
the term "professionally defined need."

(See Vickers GR: What sets the goals of public health. Lancet 1:599, 1958.) 

NEONATAL MORTALITY RATE 

1. In vital statistics, the number of deaths in infants under 28 days of age in
a given period, usually a year, per 1000 live births in that period. 

2. In obstetric and perinatal research the term "neonatal mortality rate" is often 
used to denote the cumulative mortality rate of live-born infants within 28
days of age. 

nested case control STUDY A case control study in which cases and controls are drawn 
from the population in a cohort study. As some data are already available about 
both cases and controls, the effects of some potential confounding variables are 
reduced or eliminated. 

net migration The numerical difference between immigration and emigration. 

net migration rate The net effect of immigration and emigration on an area's pop¬ 
ulation expressed as an increase or decrease per 1000 population of the area in a 
given year. 
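As a small worked illustration, with made-up figures and a function name chosen for the example, the rate can be computed as follows. The entry does not say which population count serves as the denominator, so the midyear population used here is an assumption.

```python
def net_migration_rate(immigrants, emigrants, population, per=1000):
    """Net migration in a year per `per` population of the area."""
    if population <= 0:
        raise ValueError("population must be positive")
    return (immigrants - emigrants) / population * per


# Hypothetical area: 5,000 arrivals, 3,000 departures, midyear population 400,000.
print(net_migration_rate(5_000, 3_000, 400_000))   # positive: net increase
print(net_migration_rate(1_000, 3_000, 400_000))   # negative: net decrease
```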

net reproduction rate The average number of female children born per woman in
a cohort subject to a given set of age-specific fertility rates, a given set of age-specific
mortality rates, and a given sex ratio at birth. This rate measures replacement fertility
under given conditions of fertility and mortality: it is the ratio of daughters to
mothers assuming continuation of the specified conditions of fertility and mortality.
It is a measure of population growth from one generation to another under constant
conditions. This rate is similar to the gross reproduction rate, but takes into
account that some women will die before completing their childbearing years. An
NRR of 1.00 means that each generation of mothers is having exactly enough
daughters to replace itself in the population. See also gross reproduction rate.
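A simplified calculation with 5-year age groups shows how mortality separates the net rate from the gross rate. The schedules below are invented, and the discrete-sum formulation, the function name, and the parameter names are assumptions of this sketch.

```python
def reproduction_rates(fertility, survival, female_fraction, width=5):
    """Return (gross, net) reproduction rates from age-group schedules.

    fertility:       annual births per woman in each age group
    survival:        probability that a newborn girl survives to each age group
    female_fraction: proportion of births that are female
    width:           number of years spent in each age group
    """
    gross = width * female_fraction * sum(fertility)
    net = width * female_fraction * sum(f * s for f, s in zip(fertility, survival))
    return gross, net


# Hypothetical schedules for ages 15-19 through 45-49 (not real data).
fertility = [0.05, 0.12, 0.12, 0.08, 0.04, 0.01, 0.00]
survival = [0.97, 0.96, 0.95, 0.94, 0.93, 0.92, 0.90]

gross, net = reproduction_rates(fertility, survival, female_fraction=0.488)
print(f"gross = {gross:.3f}, net = {net:.3f}")
```

The net rate is always at or below the gross rate, and the two coincide only if every woman survives through her childbearing years.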

New York State Identification and Intelligence System (NYSIIS) A method of
identifying individuals for record linkage based on phonetic spelling of full names,
sequence of digits for birthdate, birthplace, sex, name at birth, and parents' names.
See also Hogben number; Soundex code.

nidus A focus of infection. The term can be used to describe any heterogeneity in the
distribution of a disease, but is usually applied to a small area in which conditions
favor occurrence and spread of a communicable disease; also, the site of origin of
a pathological process.

Nightingale, Florence (1820-1910) An English woman who is identified as the founder
of modern nursing, but was much more. In addition to her famous work of elevating
nursing to a noble profession during the Crimean War, and establishing a training
school for nurses at St. Thomas’s Hospital in London, she recognized the importance
of statistical analysis of hospital records (Notes on Hospitals, London:
Longmans, 1859); her contributions were recognized by election to Fellowship of
the Royal Statistical Society. Her best-known work is Notes on Nursing (1860).

noise (in data) This term is used when extraneous uncontrolled variables and/or er¬ 
rors influence the distribution of measurements that are made in a study, thus 
rendering difficult or impossible the determination of relationships between vari¬ 
ables under scrutiny. 

nomenclature A list of all approved terms for describing and recording observations. 

NOMINAL SCALE See MEASUREMENT SCALE. 

nomogram A form of line chart showing scales for the variables involved in a particular 
formula in such a way that corresponding values for each variable lie on a straight 
line intersecting all the scales. 

[Figure: Nomogram of confidence limits to a rate. From Rosenbaum, Nomograms for
rates per 1000, Br Med J 1:169-170, 1963.]

NONCONCURRENT STUDY See HISTORICAL COHORT STUDY.

NONDIFFERENTIAL MISCLASSIFICATION See MISCLASSIFICATION.

NONEXPERIMENTAL STUDY See OBSERVATIONAL STUDY. 

NONPARAMETRIC METHODS See DISTRIBUTION-FREE METHOD.

NONPARAMETRIC TEST See DISTRIBUTION-FREE METHOD.

nonparticipants (Syn: nonresponders) Members of a study sample or population who
do not take part in the study for whatever reason, or members of a target population
who do not participate in an activity. Differences between participants and
nonparticipants have been demonstrated repeatedly in studies of many kinds, and
this is often a source of bias.

no-observed-effect level (NOEL) A term from toxicology, meaning the highest dose
at which no adverse health effects are detected in an animal population. A NOEL-SF
is a no-observed-effects level with an added safety factor for human exposures,
used in setting human safety standards.

norm This term has two quite distinct meanings:

1. The first is "what is usual," e.g., the range into which blood pressure values
usually fall in a population group, the dietary or infant feeding practices that 
are usual in a given culture, or the way that a given illness is usually treated 
in a given health care system. 

2. The second sense is "what is desirable," e.g., the range of blood pressures that
a given authority regards as being indicative of present good health or as 
predisposing to future good health, the dietary or infant feeding practices that 
are valued in a given culture, or the health care procedures or facilities for 
health care that a given authority regards as desirable. 

In the latter sense, norms may be used as criteria when evaluating health care, in
order to determine the degree of conformity with what is desirable, the average 
length of stay of patients in hospital, etc. A distinction is sometimes made between 
norms, defined as quantitative indexes based on research, and standards, which are 
fixed arbitrarily. 

NORMAL This term has three distinct meanings. Conceptual difficulties may arise if these
different meanings are not specified, or if the area of their overlap is not clearly
understood.

1. Within the usual range of variation in a given population or population group;
or frequently occurring in a given population or group. In this sense, "normal"
is frequently defined as "within a range extending from two standard
deviations below the mean to two standard deviations above the mean," or
"between specified (e.g., the 10th and 90th) percentiles of the distribution."

2. In good health, indicative or predictive of good health, or conducive to good 
health. For a diagnostic or screening test, a "normal" result is one in a range 
within which the probability of a specific disease is low (see also normal lim¬ 
its). 

3. (Of a distribution) Gaussian; see also normal distribution. 

normal distribution (Syn: Gaussian distribution) The continuous frequency 
distribution of infinite range represented by the equation 

f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x-\mu)^2 / (2\sigma^2)}

[Figure: Normal distribution of heart rate. From Rimm et al., 1980.] 

The properties of a normal distribution include the following: (1) it is a 
continuous, symmetrical distribution; both tails extend to infinity; (2) the arithmetic mean, 
mode, and median are identical; and (3) its shape is completely determined by the 
mean and standard deviation. 
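
The properties listed above can be checked numerically. The following is a minimal sketch using only the Python standard library; the heart-rate mean and standard deviation are hypothetical values chosen for illustration, and the function names are not taken from the source.

```python
import math

def normal_pdf(x, mean, sd):
    """Ordinate f(x) of the normal curve with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def fraction_within(k_sd):
    """Fraction of a normal population within k_sd standard deviations of the mean."""
    return math.erf(k_sd / math.sqrt(2))

# Hypothetical heart-rate distribution: mean 70 beats/min, SD 10 beats/min.
mean, sd = 70.0, 10.0
left = normal_pdf(mean - 15, mean, sd)    # ordinate 1.5 SD below the mean
right = normal_pdf(mean + 15, mean, sd)   # ordinate 1.5 SD above the mean
peak = normal_pdf(mean, mean, sd)         # ordinate at the mean (also the mode)
within_two_sd = fraction_within(2)

print(left == right)            # symmetry about the mean
print(peak > left)              # the curve is highest at the mean
print(round(within_two_sd, 4))  # share of values within two SDs of the mean
```

The last value shows why "within two standard deviations of the mean" is often used as a working definition of the usual range: it covers roughly 95% of a normally distributed population.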

normal limits The limits of the "normal" range of a test or measurement, in the sense 
of being indicative of or conducive to good health. One way to determine normal 
limits is to compare the values obtained when the measurements are made in two 
groups, one that is healthy and has been found to remain healthy, the other ill, or 
subsequently found to become ill. The result may be two overlapping distributions, 
as illustrated. Outside the area where the distributions overlap, a given value clearly 
identifies the presence or absence of disease or some other manifestation of poor 
health. If a value falls into the area of overlap, the individual may belong to either 
the normal or the abnormal group. The choice of the normal limits depends upon 
the relative importance attached to the identification of individuals as healthy or 
unhealthy. See also false negative; false positive; sensitivity and specificity. 
[Figure: Hypothetical distribution of normal and diabetic glucose levels. 
From Lilienfeld and Lilienfeld, 1979.] 

In the normal distribution equation, x is the abscissa, f(x) is the ordinate, μ is the mean, 
e is the base of the natural logarithm (2.718), and σ is the standard deviation. 


normative Pertaining to the normal, usual, accepted standard or values. See also norm. 

nosocomial Arising while a patient is in a hospital or as a result of being in a hospital; 
relating to a hospital; denoting a new disorder (unrelated to the patient's primary 
condition) associated with being in a hospital. 

nosocomial infection (Syn: hospital-acquired infection) An infection originating in a 
medical facility, e.g., occurring in a patient in a hospital or other health care facility 
in whom the infection was not present or incubating at the time of admission. 
Includes infections acquired in the hospital but appearing after discharge; it also 
includes such infections among staff. 

nosography, nosology Classification of ill persons into groups, whatever the criteria 
for their classification, and agreement as to the boundaries of the groups, is called 
"nosology." The assignment of names to each disease entity in the group results in 
a nomenclature of disease entities, or nosography. (Faber K: Nosography in Modern 
Internal Medicine. New York: Hoeber, 1923.) 

notifiable disease A disease that, by statutory requirements, must be reported to the 
public health authority in the pertinent jurisdiction when the diagnosis is made. 

A disease deemed of sufficient importance to the public health to require that its 
occurrence be reported to health authorities. 

The reporting to public health authorities of communicable diseases is, unfortu¬ 
nately, very incomplete. The reasons for this include diagnostic inexactitude; the 
desire of patients and physicians to conceal the occurrence of conditions carrying a 
social stigma, e.g., sexually transmitted diseases; and the indifference of physicians 
to the usefulness of information about such diseases as hepatitis, influenza, and 
measles. Yet notifications are extremely important. They provide the starting point 
for investigations into the failure of preventive measures such as immunizations, 
for tracing sources of infection, for finding common vehicles of infection, for de¬ 
scribing the geographic clustering of infection, and for various other purposes, de¬ 
pending upon the particular disease. 

N.S., n.s. Abbreviation, usually written lower case, for not statistically significant. 

null hypothesis (Syn: test hypothesis) The statistical hypothesis that one variable has 
no association with another variable or set of variables, or that two or more 
population distributions do not differ from one another. In simplest terms, the null 
hypothesis states that the results observed in a study, experiment, or test are no 
different from what might have occurred as a result of the operation of chance 
alone. 

numerator The upper portion of a fraction used to calculate a rate or a ratio. 

numerical taxonomy The construction of homogeneous groupings or taxa using 
numerical methods; allied to cluster analysis. 

observational study (Syn: nonexperimental study, survey) Epidemiologic study in 
situations where nature is allowed to take its course; changes or differences in one 
characteristic are studied in relation to changes or differences in other(s), without 
the intervention of the investigator. 

observer variation (error) Variation (or error) due to failure of the observer to 
measure or to identify a phenomenon accurately. Observer variation erodes 
scientific credibility whenever it appears. Sir Thomas Browne in Pseudodoxia Epidemica 
(1646), subtitled "Enquiries into very many commonly received tenents and 
presumed truths," recognized several sources of error: "the common infirmity of 
human nature, the erroneous disposition of the people, misapprehension, fallacy or 
false deduction, credulity, obstinate adherence to authority, the belief in popular 
conceits, the endeavours of Satan." 

All observations are subject to variation. Discrepancies between repeated 
observations by the same observer and between different observers are to be expected; 
these can be diminished but probably never absolutely eliminated. 

Variation may arise from several sources. The observer may miss an abnormality 
or think he has found one where none is present; a measurement or a test may 
give incorrect results due to faulty technique or incorrect reading and recording of 
the results; or the observer may misinterpret the information. Two varieties of ob¬ 
server variation are interobserver variation, i.e., the amount observers vary from 
one another when reporting on the same material, and intraobserver variation, the 
amount one observer varies between observations when he reports more than once 
on the same material. 

Occam's Razor (Syn: scientific parsimony) William of Occam’s 14th century dictum 
was that, “the assumptions introduced to explain a thing must not be multiplied 
beyond necessity." This useful maxim does not contradict the conclusion that mul¬ 
tiple causes operate in any system. The number of causes implicated depends on 
the frame of reference of the investigator and on the scope of the inquiry. 

occurrence (Syn: frequency) In epidemiology, a general term describing the fre¬ 
quency of a disease or other attribute or event in a population without distinguish¬ 
ing between incidence and prevalence. 

odds The ratio of the probability of occurrence of an event to that of nonoccurrence, 
or the ratio of the probability that something is so, to the probability that it is not 
so. If 60 smokers develop a chronic cough and 40 do not, the odds among these 
100 smokers in favor of developing a cough are 60:40, or 1.5; this may be con¬ 
trasted with the probability that these smokers will develop a cough, which is 60/ 
100 or 0.6. 
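
The smokers example above can be reproduced in a few lines. This is a minimal sketch; the function name is illustrative and not part of the source.

```python
def odds_and_probability(events, non_events):
    """Return (odds, probability) from counts of events and non-events."""
    odds = events / non_events                      # occurrence : nonoccurrence
    probability = events / (events + non_events)    # occurrence / all subjects
    return odds, probability

# 60 smokers develop a chronic cough and 40 do not.
odds, probability = odds_and_probability(60, 40)
print(odds, probability)
```

The two quantities are linked by odds = probability / (1 - probability), so either can be recovered from the other.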

odds ratio (Syn: cross-product ratio, relative odds) The ratio of two odds. The term 
"odds" is defined differently according to the situation under discussion. Consider 
the following notation for the distribution of a binary exposure and a disease in a 
population or a sample. 

              Exposed    Unexposed 
Disease          a           b 
No disease       c           d 

The odds ratio (cross-product ratio) is ad/bc. 

The exposure-odds ratio for a set of case control data is the ratio of the odds in 
favor of exposure among the cases (a/b) to the odds in favor of exposure among 
noncases (c/d). This reduces to ad/bc. With incident cases, unbiased subject selection, 
and a "rare" disease (say, under 2% cumulative incidence rate over the study 
period), ad/bc is an approximate estimate of the risk ratio. With incident cases, 
unbiased subject selection, and density sampling of controls, ad/bc is an estimate of 
the ratio of the person-time incidence rates (forces of morbidity) in the exposed 
and unexposed (no rarity assumption is required for this). 

The disease-odds (rate-odds) ratio for a cohort or cross section is the ratio of the 
odds in favor of disease among the exposed (a/c) to the odds in favor of disease 
among the unexposed (b/d). This reduces to ad/bc and hence is equal to the 
exposure-odds ratio for the cohort or cross section. 

The prevalence-odds ratio refers to an odds ratio derived cross sectionally, as, for 
example, an odds ratio derived from studies of prevalent (rather than incident) 
cases. 

The risk-odds ratio is the ratio of the odds in favor of getting disease, if exposed, 
to the odds in favor of getting disease if not exposed. The odds ratio derived from 
a cohort study is an estimate of this. See also case control study. 
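
The equality of the exposure-odds ratio, the disease-odds ratio, and the cross-product ratio can be verified directly. The sketch below uses the cell labels a, b, c, d from the entry's table; the counts are hypothetical and the function name is illustrative.

```python
def odds_ratios(a, b, c, d):
    """Return (exposure-odds ratio, disease-odds ratio, cross-product ratio).

    Cell labels follow the entry's 2x2 table:
    a = exposed with disease,    b = unexposed with disease,
    c = exposed without disease, d = unexposed without disease.
    """
    exposure_or = (a / b) / (c / d)     # odds of exposure: cases vs. noncases
    disease_or = (a / c) / (b / d)      # odds of disease: exposed vs. unexposed
    cross_product = (a * d) / (b * c)   # ad/bc
    return exposure_or, disease_or, cross_product

# Hypothetical counts, chosen only for illustration.
exposure_or, disease_or, cross_product = odds_ratios(a=30, b=10, c=70, d=90)
print(exposure_or, disease_or, cross_product)
```

All three values agree up to floating-point rounding, which is why the same figure can be read either as a ratio of exposure odds or as a ratio of disease odds.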

one-tail test A statistical significance test based on the assumption that the data have 
only one possible direction of variability. 

operational research The systematic study, by observation and experiment, of the 
working of a system, e.g., health services, with a view to improvement. 

OPERATIONS RESEARCH 

1. The fitting of models to data, or the designing of models. 

2. Synonym for operational research. 

opportunistic infection Infection with organism(s) that are normally innocuous, e.g., 
commensals in the human, but become pathogenic when the body's immunologic 
defenses are compromised, as happens in the acquired immunodeficiency syn¬ 
drome (AIDS). 

ORDINAL SCALE See MEASUREMENT SCALE. 

ordinate The distance of a point, P, from the horizontal or x axis of a graph, mea¬ 
sured along the vertical or y axis. See also abscissa; graph. 

outcomes All the possible results that may stem from exposure to a causal factor, or 
from preventive or therapeutic interventions; all identified changes in health status 
arising as a consequence of the handling of a health problem. See also causality; 
causation of disease, factors in. 

outliers Observations differing so widely from the rest of the data as to lead one to 
suspect that a gross error may have been committed, or suggesting that these values 
come from a different population. 

outbreak (Syn: epidemic) Sometimes the preferred word, as it may escape 
sensationalism associated with the word epidemic. Alternatively, a localized as opposed to 
generalized epidemic. 

output The immediate result of professional or institutional health care activities, 
usually expressed as units of service, e.g., patient hospital days, outpatient visits, 
laboratory tests performed. 

overmatching A situation that may arise when groups are matched. Several varieties 
can be distinguished: 

1. The matching procedure partially or completely obscures evidence of a true 
causal association between the independent and dependent variables. Over¬ 
matching may occur if the matching variable is involved in, or is closely con¬ 
nected with, the mechanism whereby the independent variable affects the de¬ 
pendent variable. The matching variable may be an intermediate cause in the 
causal chain or it may be strongly affected by, or a consequence of, such an 
intermediate cause. 

2. The matching procedure uses one or more unnecessary matching variables, 
e.g., variables that have no causal effect or influence on the dependent vari¬ 
able, and hence cannot confound the relationship between the independent 
and dependent variables. 

3. The matching process is unduly elaborate, involving the use of numerous 
matching variables and/or insisting on very close similarity with respect to spe¬ 
cific matching variables. This leads to difficulty in finding suitable controls. 
See also matching. 

overwintering See vector-borne infection. 
P, P (probability) value The probability that a test statistic would be as extreme as 
or more extreme than observed if the null hypothesis were true. The letter P, fol¬ 
lowed by the abbreviation n.s. (not significant) or by the symbol < (less than) and a 
decimal notation such as 0.01, 0.05, is a statement of the probability that the differ¬ 
ence observed could have occurred by chance, if the groups are really alike, i.e., 
under the null hypothesis. 

Investigators may arbitrarily set their own significance levels, but in most biomed¬ 
ical and epidemiologic work, a study result whose probability value is less than 5% 
(P<0.05) or 1% (P<0.01) is considered sufficiently unlikely to have occurred by 
chance to justify the designation "statistically significant." See also statistical 
significance. 
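
As a concrete illustration of "as extreme as or more extreme than observed," the sketch below computes a one-sided P value for a simple binomial outcome. The scenario (9 successes in 10 trials under a null probability of 0.5) and the function name are assumptions made for the example, not part of the source.

```python
from math import comb

def binomial_upper_tail_p(k, n, p0=0.5):
    """P(X >= k) for X ~ Binomial(n, p0): a one-sided P value under the null."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# Hypothetical result: 9 of 10 paired comparisons favor the new treatment.
# Under the null hypothesis each comparison is a 50:50 chance event.
p_value = binomial_upper_tail_p(9, 10)
print(p_value)
print(p_value < 0.05)   # conventional 5% threshold
print(p_value < 0.01)   # stricter 1% threshold
```

Here the tail consists of the outcomes 9 and 10, giving 11/1024, which falls below the 5% level but not below the 1% level.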

paired samples In a clinical trial, pairs of subject patients may be studied. One 
member of each pair receives the experimental regimen, and the other receives a 
suitably designated control regimen. Pairing should be based on a prognostic 
variable such as age. 

Pairing may similarly be used in a case control study or in a cohort study. 
See also matching. 

pandemic An epidemic occurring over a very wide area and usually affecting a large 
proportion of the population. 

panel study A combination of cross-sectional and cohort methods, in which the inves¬ 
tigator conducts a series of cross-sectional studies of the same individuals or study 
sample. This method of study permits changes in one variable to be related to 
changes in other variables. See also nested case-control study. 

Panum, Peter Ludwig (1820-1885) A Danish physician who observed firsthand an 
epidemic of measles in the Faroe Islands in 1846. This was the first outbreak there 
for many years, and from the epidemic pattern, Panum deduced some basic, 
previously unknown details about the method of spread, the incubation period, the 
lasting immunity that followed infection, and the relationship between age and 
severity of infection. 

paradigm A typical example, a pattern of thought or conceptualization; an overall way 
of regarding phenomena, within which scientists normally work. A paradigm may 
dictate what form of explanation will be found acceptable, but a science may change 
paradigms. In many contexts in which it is used, the term is both ambiguous and 
vague.(1) The word is often used loosely as a synonym for "factor" or "variable." 

1 Kuhn T. The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1962. 

parameter In mathematics, a constant in a formula or model; in statistics and 
epidemiology, a measurable characteristic of a population. 

parametric test A statistical test that depends upon assumption(s) about the 
distribution of the data, e.g., that these are normally distributed. 

parasite An animal or vegetable organism that lives on or in another and derives its 
nourishment therefrom. An obligate parasite is one that cannot lead an 
independent nonparasitic existence. A facultative parasite is one that is capable of either 
parasitic or independent existence. 

parasite count See worm count. 

parasite density The collective degree of parasitemia in a population, calculated by 
the use of either the geometric mean or the weighted average of the individual 
parasite counts; e.g., by using a frequency distribution based on a geometric 
progression. 

paratenic host (Syn: transport host) A second, third, or subsequent intermediate host 
of a parasite, in which the parasite does not undergo any development or 
replication, but remains, usually encysted, until the paratenic host is ingested by the 
definitive host of the parasite. 

parity The status of a woman as regards the fact of having borne viable children. The 
number of full-term children previously borne by a woman, excluding miscarriages 
or abortions in early pregnancy, but including stillbirths. 

particularization A method of analysis opposite to generalization or abstraction. It 
focuses on the specificity of a number of facts and illustrates an issue through the 
use of example. 

passage The transfer of micro-organisms from human to animal host(s) either directly 
or via laboratory culture; in the laboratory, this procedure is used to establish the 
Henle-Koch postulates. 

passenger variable A variable that varies systematically with the dependent variable 
under study, without being causally related to it; a third (explanatory) variable, the 
common cause of both the dependent and the passenger variable, "explains" or 
accounts for their association. 

passive smoking See involuntary smoking. 

Pasteur, Louis (1822-1895) A French chemist and biologist. One of the founders of 
bacteriology and therefore an important figure also in epidemiology. Starting in 
chemistry, he worked out the biological basis for fermentation, and then went on 
to make many important discoveries in bacteriology, notably vaccines against 
anthrax and rabies. He is, of course, eponymously honored by the word 
"pasteurization." 

path analysis A mode of analysis involving assumptions about the direction of causal 
relationship between linked sequences and configurations of variables. This 
permits the analyst to construct and test the appropriateness of alternative models (in 
the form of a path diagram) of the causal relations that may exist within the array 
of variables included in the finite system studied. Identification of the less probable 
sequences of causal pathways may permit them to be eliminated from further 
consideration. 

pathogen Organism capable of causing disease (literally, causing a pathological 
process). 

pathogenesis The postulated mechanisms by which the etiologic agent produces 
disease. The difference between etiology and pathogenesis should be noted: The 
etiology of a disease or disability consists of the postulated causes that initiate the 
pathogenetic mechanisms; control of these causes might lead to prevention of the 
disease. 

pathogenicity The property of an organism that determines the extent to which overt 
disease is produced in an infected population, or the power of an organism to 
produce disease. Also used to describe comparable properties of toxic chemicals, 
etc. Pathogenicity of infectious agents is measured by the ratio of the number of 
persons developing clinical illness to the number exposed to infection. See also 
virulence, with which pathogenicity is sometimes confused. 

Pearson, Karl (1857-1936) British mathematician, biologist and geneticist. Pearson 
was a pupil of Francis Galton, who led the science of statistics further into 
applications in biology and genetics. He founded the journal Biometrika, coined the word 
"biometry," and taught the next generation of statistician/epidemiologists, including 
Major Greenwood, Raymond Pearl, and others. 

PEARSON'S PRODUCT MOMENT CORRELATION See CORRELATION COEFFICIENT. 

pedigree A diagram showing the ancestral relationships and transmission of genetic 
traits over several generations of a family. 

peer review Process of review of research proposals, manuscripts submitted for pub¬ 
lication, abstracts submitted for presentation at scientific meetings, whereby these 
are judged for scientific and technical merit by other scientists in the same field. 
Also refers to review of clinical performance, when it is a form of medical audit. 

penetrance The frequency, expressed as a percentage, with which individuals of a 
given genotype manifest at least some degree of a specific mutant phenotype 
associated with a trait. See also genetic penetrance. 

perceived need A felt need. The term usually refers to need for health care that is 
felt by the person or community concerned, but which may not be perceived by 
health professionals. 

percentile The set of divisions that produce exactly 100 equal parts in a series of 
continuous values, such as children's heights or weights. Thus a child above the 
90th percentile has a greater value for height or weight than over 90% of all in the 
series. 

perinatal mortality Literally, mortality around the time of birth. Conventionally this 
time is limited to the period between 28 weeks gestation and one week postnatal. 
However, as the following discussion indicates, other factors, especially the weight 
of the fetus, should be considered. The Ninth (1975) Revision of the International 
Classification of Diseases includes the following: 

Perinatal mortality statistics 

It is recommended that national perinatal statistics should include all fetuses and 
infants delivered weighing at least 500 g (or, when birth weight is unavailable, the 
corresponding gestational age [22 weeks] or body length [25 cm crown-heel]), 
whether alive or dead. It is recognized that legal requirements in many countries 
may set different criteria for registration purposes, but it is hoped that countries 
will arrange the registration or reporting procedures in such a way that the events 
required for inclusion in the statistics can be identified easily. It is further recom¬ 
mended that less mature fetuses and infants should be excluded from perinatal 
statistics unless there are legal or other valid reasons to the contrary. 

It is recommended above that national statistics would include fetuses and infants 
weighing between 500 g and 1000 g both for their inherent value and because their 
inclusion improves the completeness of reporting at 1000 g and over. 

Inclusion of this group of very immature births, however, disrupts international 
comparisons because of differences in national practices concerning their registra¬ 
tion. Another factor affecting international comparisons is that all live-born infants, 
irrespective of birth weight, are included in the calculation of rates, whereas some 
lower limit of maturity is applied to infants born dead. 

In order to eliminate these factors, it is recommended that countries should 
present, solely for international comparisons, "standard perinatal statistics" in which 
both the numerator and denominator of all rates are restricted to fetuses and 
infants weighing 1000 g or more (or, where birth weight is unavailable, the 
corresponding gestational age [28 weeks] or body length [35 cm crown-heel]). 

perinatal mortality rate In most industrially developed nations, this is defined as 

\text{Perinatal mortality rate} = \frac{\text{Fetal deaths (28 weeks+ of gestation)} + \text{postnatal deaths (first week)}}{\text{Fetal deaths (28 weeks+ of gestation)} + \text{live births}} \times 1000


The World Health Organization's definition, more appropriate in nations with less 
well-established vital records, is 

\text{Perinatal mortality rate} = \frac{\text{Late fetal deaths (28 weeks+ of gestation)} + \text{postnatal deaths (first week)}}{\text{Live births in a year}} \times 1000

Note the differences in denominator of the perinatal mortality rate as defined by 
WHO and in industrially developed nations. This makes international comparison 
difficult. The WHO Expert Committee on the Prevention of Perinatal Mortality 
and Morbidity (1970) recommended a more precise formulation: "Late fetal and 
early neonatal deaths weighing over 1000 g at birth expressed as a ratio per 1000 
live births weighing over 1000 g at birth." 
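
The effect of the two denominators can be seen by applying both definitions to the same counts. This is a minimal sketch; the counts are hypothetical and the function names are illustrative, not official terminology.

```python
def perinatal_rate_developed(late_fetal_deaths, first_week_deaths, live_births):
    """Rate per 1000 total births (late fetal deaths plus live births)."""
    numerator = late_fetal_deaths + first_week_deaths
    return 1000 * numerator / (late_fetal_deaths + live_births)

def perinatal_rate_who(late_fetal_deaths, first_week_deaths, live_births):
    """Rate per 1000 live births."""
    numerator = late_fetal_deaths + first_week_deaths
    return 1000 * numerator / live_births

# Hypothetical annual counts for one population.
counts = dict(late_fetal_deaths=50, first_week_deaths=40, live_births=9950)
rate_developed = perinatal_rate_developed(**counts)
rate_who = perinatal_rate_who(**counts)
print(rate_developed, rate_who)
```

Because the live-births denominator is the smaller of the two, the same events always give a slightly higher figure under the live-births definition, so rates computed under different definitions are not directly comparable.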

periodic (medical) examinations Assessment of health status conducted at 
predetermined intervals, e.g., annually or at specified milestones in life such as infancy, 
school entry, preemployment, or preretirement. This form of medical examination 
generally follows a formal protocol, e.g., employing a set of structured questions 
and/or a predetermined set of laboratory tests. 

PERIOD OF COMMUNICABILITY See COMMUNICABLE PERIOD. 

permissible exposure limit (PEL) An occupational health standard to safeguard 
employees against dangerous chemicals or contaminants in the workplace. See safety 
standards. 

personal health care Those services to individuals that are performed on a 
one-to-one basis by a health care worker for the purpose of maintaining or restoring health. 

personal monitoring device An instrument attached to a person to measure the 
exposure of that person to hazardous substance(s). 

person-time A measurement combining persons and time, used as denominator in 
person-time incidence and mortality rates. It is the sum of individual units of time 
that the persons in the study population have been exposed to the condition of 
interest. A variant is person-distance, e.g., as in passenger-kilometers. The most 
frequently used person-time is person-years. With this approach, each subject con¬ 
tributes only as many years of observation to the population at risk as he is actually 
observed; if he leaves after one year, he contributes one person-year; if after ten, 
ten person-years. The method can be used to measure incidence over extended and 
variable time periods. 

person-time incidence rate (Syn: interval incidence density) A measure of the 
incidence rate of an event, e.g., a disease or death, in a population at risk, given by 

\frac{\text{Number of events occurring during the interval}}{\text{Number of person-time units at risk observed during the interval}}
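
A short worked example may help: each subject contributes only the time actually observed, and the events are divided by the summed person-time. The follow-up times and event count below are hypothetical, and the function name is illustrative.

```python
def person_time_incidence_rate(events, follow_up_years):
    """Return (rate per person-year, total person-years observed)."""
    person_years = sum(follow_up_years)   # each subject contributes time observed
    return events / person_years, person_years

# Hypothetical cohort: four subjects followed for 1, 10, 4, and 5 years,
# with 2 events observed during follow-up.
follow_up_years = [1, 10, 4, 5]
rate, person_years = person_time_incidence_rate(2, follow_up_years)
print(person_years)    # total person-years at risk
print(rate * 1000)     # events per 1000 person-years
```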


person-to-person spread of disease (Syn: prosodemic) See transmission of infection. 

PERSON-YEARS See PERSON-TIME. 

Petty, William (1623-1687) A member of the same circle as John Graunt, he is equally 
recognized as a pioneer in vital statistics and economics. His ideas and concepts of 
lifetime earning capability are contained in Political Arithmetic (London, 1691). 

pharmacoepidemiology The study of the distribution and determinants of drug-related 
events in populations, and the application of this study to efficacious drug 
treatment. 

physician (Syn: medical practitioner, doctor) Professional person qualified by educa¬ 
tion and authorized by law to practice medicine. 

pie chart A circular diagram divided into segments, each representing a category or 
subset of data. The amount for each category is proportional to the angle sub¬ 
tended at the center of the circle and hence to the area of the sector. 

When several pie charts are used to describe several populations, the area of each 
circle is proportional to the size of the population it represents. 

pilot investigation, study A small-scale test of the methods and procedures to be 
used on a larger scale if the pilot study demonstrates that these methods and pro¬ 
cedures can work. 

placebo, placebo effect An inert medication or procedure. The placebo effect 
(usually but not necessarily beneficial) is attributable to the expectation that the regimen 
will have an effect, i.e., the effect is due to the power of suggestion. See also halo 
effect. 

POINT SOURCE EPIDEMIC See EPIDEMIC, COMMON SOURCE. 

Poisson distribution A distribution function used to describe the occurrence of rare 
events or to describe the sampling distribution of isolated counts in a continuum of 
time or space (e.g., sample counts of radioactive disintegration per minute). The 
number of events has a Poisson distribution with parameter λ (lambda) if the 
probability of observing k events (k = 0, 1, . . .) is equal to 

P(k) = \frac{e^{-\lambda} \lambda^{k}}{k!}

where e is the base of natural logarithm, 2.7183. . . . The mean and variance of 
the distribution are both equal to λ. This distribution is used in modeling 
person-time incidence rates. 
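
The statement that the mean and variance both equal the parameter can be checked numerically. The sketch below uses the standard Poisson probability formula with a hypothetical parameter value of 2.5 and truncates the infinite sum at k = 59, where the remaining probability is negligible.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the parameter is lam."""
    return exp(-lam) * lam**k / factorial(k)

lam = 2.5                 # hypothetical expected number of events
ks = range(60)            # truncation point; the tail beyond it is negligible
total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
print(total, mean, variance)
```

The probabilities sum to 1, and both the mean and the variance come out equal to 2.5 up to rounding error.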

pollution Any undesirable modification of air, water, or food by substance(s) that are 
toxic or may have adverse effects on health or that are offensive though not nec¬ 
essarily harmful to health. 

polygenic inheritance The transmission of a phenotypic trait whose expression 
depends upon the additive effect of a number of genes. 

ponderal index The anthropometric index of body mass. Defined as height divided 
by the cube root of the body weight. The body mass index is generally regarded as 
a better index of body mass. 
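
For comparison, both indexes can be computed for the same person. The entry does not state units, so centimeters and kilograms are assumed here; the body mass index is taken in its usual form, weight in kilograms divided by the square of height in meters. The subject's measurements are hypothetical.

```python
def ponderal_index(height_cm, weight_kg):
    """Height divided by the cube root of body weight (units are an assumption)."""
    return height_cm / weight_kg ** (1 / 3)

def body_mass_index(height_cm, weight_kg):
    """Weight in kilograms divided by the square of height in meters."""
    height_m = height_cm / 100
    return weight_kg / height_m**2

# Hypothetical adult: 170 cm tall, weighing 64 kg.
pi_value = ponderal_index(170, 64)
bmi_value = body_mass_index(170, 64)
print(round(pi_value, 2), round(bmi_value, 2))
```

Note that the two indexes move in opposite directions: as weight rises at a fixed height, the ponderal index as defined here falls while the body mass index rises.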


population 

1. All the inhabitants of a given country or area considered together; the number 
of inhabitants of a given country or area. 

2. (In sampling) The whole collection of units from which a sample may be drawn; 
not necessarily a population of persons; the units may be institutions, records, 
or events. The sample is intended to give results that are representative of the 
whole population. 

population attributable risk (PAR) This term is used by many epidemiologists(1,2) 
in preference to the terms "attributable fraction (population)" or "etiologic fraction 
(population)." It is the incidence of a disease in a population that is associated with 
(attributable to) exposure to the risk factor. It is often expressed as a percentage. 
It is calculated by similar methods to those described for attributable fraction 
(population), i.e., 

\text{PAR\%} = \frac{P_e (I_e - I_u)}{P_t I_t} \times 100

where P_e = number of persons exposed 
P_t = persons in the population 
I_e = incidence rate among the exposed 
I_u = incidence rate among the unexposed 
I_t = incidence rate for the total population 

In a case-control study, PAR can be estimated in various ways; Cole and 
MacMahon(3) give the following formula: 

\text{PAR\%} = \frac{P_e (RR - 1)}{1 + P_e (RR - 1)} \times 100

where P_e = proportion of controls exposed 
RR = relative risk for exposed, compared to risk of 1 for the unexposed. 
1 MacMahon B, Pugh TF: Epidemiology: Principles and Methods. Boston: Little, Brown, 1970. 
2 Fletcher RH, Fletcher SW, Wagner EH: Clinical Epidemiology: The Essentials. Baltimore: Williams 
& Wilkins, 1982. 
3 Cole P, MacMahon B: Attributable risk percent in case-control studies. Brit J Prev Soc Med 25:242-244, 1971. 
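
The population attributable risk percent can be illustrated with a small calculation. The sketch below makes two assumptions: the cohort form is read as the excess cases among the exposed divided by all cases in the population (algebraically the same as the total rate minus the unexposed rate, divided by the total rate), and the case-control form is the Cole and MacMahon expression in terms of the proportion of controls exposed and the relative risk. All numbers are hypothetical and the function names are illustrative.

```python
def par_percent_from_rates(n_exposed, n_total, rate_exposed, rate_unexposed):
    """Excess cases among the exposed as a percentage of all cases."""
    excess_cases = n_exposed * (rate_exposed - rate_unexposed)
    all_cases = n_exposed * rate_exposed + (n_total - n_exposed) * rate_unexposed
    return 100 * excess_cases / all_cases

def par_percent_case_control(prop_controls_exposed, relative_risk):
    """Case-control estimate from exposure prevalence in controls and relative risk."""
    excess = prop_controls_exposed * (relative_risk - 1)
    return 100 * excess / (1 + excess)

# Hypothetical population: 300 of 1000 persons exposed, incidence rate 0.04
# among the exposed and 0.01 among the unexposed (relative risk of 4).
cohort_par = par_percent_from_rates(300, 1000, 0.04, 0.01)
case_control_par = par_percent_case_control(0.3, 4.0)
print(round(cohort_par, 1), round(case_control_par, 1))
```

With a matching exposure prevalence and relative risk, the two routes give the same percentage, which is the sense in which the case-control formula estimates the population figure.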

population attributable risk percent This is the attributable fraction in the 
population, expressed as a percentage. See also attributable fraction (population). 

population based Pertaining to a general population defined by geopolitical 
boundaries; this population is the denominator and/or the sampling frame. 

population dynamics Changes in the structure of a population; loosely used as a 
synonym for demography. 

population excess rate A measure of the amount of disease associated with exposure 
to a putative cause of the disease in the population. It is the difference between the 
rates of disease in the entire population and among the nonexposed. 


population medicine See community medicine. 

population momentum In a growing population, the phenomenon of continuing 
population growth beyond the time when replacement level fertility has been achieved, 
because of the increasing size of child-bearing and younger age cohorts, resulting 
from higher fertility and/or falling mortality in preceding years. 

population pyramid A graphic presentation of the age and sex composition of the 
population. The population pyramid is constructed by computing the percentage 
distribution of a population, simultaneously cross-classified by sex and age. The 
percentage that each female age group is of the total is plotted on the right and the 
corresponding percentages for males are plotted on the left. A population pyramid 
is intended to provide a quick overall comprehension of age and sex structure in 
the population. A population whose pyramid has a broad base and narrow apex 
may be identified as a high fertility population. Changing shape over time reflects 
the changing composition of the population, associated with changes in fertility and 
mortality at each age. 

Since the figure it two dimensional, the word -pyramid" is incorrectly used, but 


the more accurate word -profile" has never caught on. 


rate to monitor in developing countries where older infants frequently die of
infections and malnutrition.

potency The strength of a particular drug, toxin, or hazard; the ratio of the dose of a 
standard amount required to elicit a specific response, to the dose of the test agent 
that elicits the same response. 

potential years of life lost (PYLL) A measure of the relative impact of various
diseases and lethal forces on society. PYLL highlights the loss to society as a result 
of youthful or early deaths. The figure for potential years of life lost due to a 
particular cause is the sum, over all persons dying from that cause, of the years that 
these persons would have lived had they experienced normal life expectation. The 
concept derives from Petty’s Political Arithmetic (1687) and is elaborated upon in 
Dublin and Lotka's Money Value of a Man (1930). 
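The summation described in this entry can be sketched as follows. The reference age of 75 years stands in for "normal life expectation" and is an assumption made only for the example, as are the function name and the ages used; deaths at or beyond the reference age are taken to contribute zero years.

```python
def pyll(ages_at_death, reference_age=75):
    """Sum, over all deaths from one cause, of the years of life lost."""
    # Assumption: a fixed reference age approximates normal life expectation.
    return sum(max(reference_age - age, 0) for age in ages_at_death)


# Hypothetical deaths from a single cause at ages 2, 30, 68, and 80.
print(pyll([2, 30, 68, 80]))  # 73 + 45 + 7 + 0 = 125
```

Because early deaths contribute the most years, two causes with the same number of deaths can have very different PYLL totals.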

power A characteristic of a statistical hypothesis test, denoting the probability that the
null hypothesis will be rejected if it is indeed false. It is equal to 1 minus the
probability of type II error. See also error, resolution. Resolving power is the
comparable property of individual measurements.
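The relationship "power = 1 minus the probability of type II error" can be illustrated with a one-sided z-test for a shift in a mean with known standard deviation. This particular test, the function name, and the numbers are assumptions chosen for the example; the entry itself does not tie power to any specific test.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test for a mean shift `effect` (known sigma)."""
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)       # rejection threshold under the null
    shift = effect * n ** 0.5 / sigma     # standardized true shift
    type_ii_error = z.cdf(critical - shift)
    return 1 - type_ii_error


# Hypothetical design: true shift of 0.5 SD, at two sample sizes.
for n in (10, 25):
    print(n, round(z_test_power(0.5, 1.0, n), 3))
```

When the true effect is zero the function returns alpha, and power rises as the sample size grows.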

pragmatic study A study whose aim is to improve health status or health care of a
specified population, provide a basis for decisions about health care, or evaluate
previous actions. See also explanatory study; community diagnosis; program review.

PRECISION 

1. The quality of being sharply defined or stated. One measure of precision is
the number of distinguishable alternatives from which a measurement was
selected, sometimes indicated by the number of significant digits in the
measurement. Another measure of precision is the standard error of measurement,
the standard deviation of a series of replicate determinations of the
same quantity. Precision does not imply accuracy. See also measurement,
problems with terminology.

2. In statistics, precision is defined as the inverse of the variance of a
measurement or estimate.

precursor An early stage in the course of a disease, or a condition or state preceding
pathological onset of a disease; sometimes detectable by screening; may be
identified as a risk marker.

predictive value In screening and diagnostic tests, the probability that a person with
a positive test is a true positive (i.e., does have the disease) is referred to as the
"predictive value of a positive test." The predictive value of a negative test is the
probability that a person with a negative test does not have the disease. The
predictive value of a screening test is determined by the sensitivity and specificity of the
test, and by the prevalence of the condition for which the test is used. See also
screening; sensitivity and specificity.
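The dependence of predictive value on sensitivity, specificity, and prevalence can be made concrete with a short sketch. The function name and the test characteristics below are hypothetical; the calculation treats the three inputs as proportions and applies them to a notional population.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Same hypothetical test applied at two different prevalences.
for prevalence in (0.01, 0.20):
    ppv, npv = predictive_values(0.90, 0.95, prevalence)
    print(prevalence, round(ppv, 3), round(npv, 3))
```

The same test yields a much lower positive predictive value when the condition is rare, which is why prevalence has to be stated alongside sensitivity and specificity.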

premunition A term used mainly in the epidemiology of parasitic diseases, especially
malaria. It signifies a state of resistance, in a host harboring a parasite, to
superinfection by a parasite of the same species. This state is dependent on the continued
survival of parasites in the body and disappears after their elimination. It may be
complete or partial.

prepatent period In parasitology, the period equivalent to the incubation period of
microbial infections; the corresponding phase may be biologically different from
microbial multiplication when the invading organism is a multicellular parasite that
undergoes developmental stages in the host.

PRESCRIPTIVE SCREENING See SCREENING.



prevalence The number of instances of a given disease or other condition in a given
population at a designated time; sometimes used to mean prevalence rate. When
used without qualification the term usually refers to the situation at a specified
point in time (point prevalence).

prevalence, annual (An occasionally used index) The total number of persons 
with the disease or attribute at any time during a year. It includes cases of the 
disease arising before but extending into or through the year as well as those having 
their inception during the year. 

prevalence, lifetime The total number of persons known to have had the disease
or attribute for at least part of their life.

prevalence, period The total number of persons known to have had the disease 
or attribute at any time during a specified period. 

prevalence, point The number of persons with a disease or an attribute at a 
specified point in time. 

prevalence rate (ratio) The total number of all individuals who have an attribute
or disease at a particular time (or during a particular period) divided by the
population at risk of having the attribute or disease at this point in time or midway
through the period. A problem may arise with calculating period prevalence rates
because of the difficulty of defining the most appropriate denominator. See also
prevalence.

prevalence study See cross-sectional study. 

preventable fraction (population) In a situation in which exposure to a given factor
is believed to protect against a disease (or other outcome), the preventable fraction
in the population is the proportion of the disease (in the population) that would be
prevented if the whole population were exposed to the factor. This value must be
interpreted with caution, as part or all of the apparent protective effect may be due
to other factors associated with the apparent protective factor.

In a study of a total population, the preventable fraction (population) is computed as

\frac{I_p - I_e}{I_p}

where I_p is the incidence rate of the disease (or other outcome) in the population,
and I_e is the incidence rate in the exposed persons in the population.

prevented fraction (population) In a situation in which exposure to a given factor is
believed to protect against a disease (or other outcome), the prevented fraction is
the proportion of the hypothetical total load of disease (in the population) that has
been prevented by exposure to the factor. This value must be interpreted with
caution, as part or all of the apparent protective effect may be due to other factors
associated with the apparent protective factor.

In a study of a total population the prevented fraction is computed as

\frac{I_u - I_p}{I_u}

where I_p is the rate of the disease in the population, and I_u is the rate among people
unexposed to the factor.
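The preventable and prevented fractions are easy to confuse, so a small sketch that computes both from the same rates may help. It takes the preventable fraction as (I_p - I_e) / I_p and the prevented fraction as (I_u - I_p) / I_u; the function names and the rates are hypothetical.

```python
def preventable_fraction(i_population, i_exposed):
    """Share of current disease that full exposure would still prevent."""
    return (i_population - i_exposed) / i_population


def prevented_fraction(i_population, i_unexposed):
    """Share of the hypothetical disease load already prevented by exposure."""
    return (i_unexposed - i_population) / i_unexposed


# Hypothetical protective factor: half the population is exposed.
i_unexposed, i_exposed = 0.010, 0.004
i_population = 0.5 * i_exposed + 0.5 * i_unexposed
print(round(preventable_fraction(i_population, i_exposed), 3))
print(round(prevented_fraction(i_population, i_unexposed), 3))
```

The first figure looks forward from the present rate, the second looks back from the rate that would apply if nobody were exposed.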

prevention The goals of medicine are to promote health, to preserve health, to restore
health when it is impaired, and to minimise suffering and distress. These goals are
embodied in the word "prevention," which is easiest to define in the context of
levels, customarily called primary, secondary, and tertiary prevention. Authorities
on preventive medicine do not agree on the precise boundaries between these
levels, nor on how many levels can be distinguished, but the differences of opinion
are semantic rather than substantive.

An epidemiologic interpretation of the distinction between primary and secondary
prevention is that primary prevention is aimed at reducing incidence of disease
and other departures from good health, secondary prevention aims to reduce
prevalence by shortening the duration, and tertiary prevention is aimed at reducing
complications.

Primary prevention can be defined as the protection of health by personal and
community-wide effects, e.g., preserving good nutritional status, physical fitness,
and emotional well-being, immunizing against infectious diseases, and making the
environment safe. (But see also health promotion.)

Secondary prevention can be defined as the measures available to individuals and 
populations for the early detection and prompt and effective intervention to correct 
departures from good health. 

Tertiary prevention consists of the measures available to reduce or eliminate
long-term impairments and disabilities, minimize suffering caused by existing departures
from good health, and to promote the patient's adjustment to irremediable
conditions. This extends the concept of prevention into the field of rehabilitation.

preventive medicine The application of preventive measures by clinical practitioners.
A specialized field of medical practice composed of distinct disciplines that utilize
skills focusing on the health of defined populations in order to promote and
maintain health and well-being and prevent disease, disability, and premature death.

In addition to the knowledge of basic and clinical sciences and the skills common
to all physicians, the distinctive aspects of preventive medicine include knowledge
of and competence in biostatistics, epidemiology, administration including planning,
organization, management, financing, and evaluation of health programs;
environmental health; application of social and behavioral factors in health and
disease; and the application of primary, secondary, and tertiary prevention measures
within clinical medicine. (The above is the definition and description of the field
that has been adopted by the American College of Preventive Medicine; for
completeness, at least two other items ought to be added, i.e., health education and
nutrition.)

primary case The individual who introduces the disease into the family or group
under study. Not necessarily the first diagnosed case in a family or group. See also
index case.


PRIMARY HEALTH CARE 

1. Health care that begins at the time of first encounter between a patient and a
provider of health care. An alternative term is primary medical care.

2. The WHO definition of primary health care includes much more: Primary
health care is essential health care made accessible at a cost the country and
the community can afford, with methods that are practical, scientifically sound,
and socially acceptable. Everyone in the community should have access to it,
and everyone should be involved in it. Related sectors should also be involved
in it in addition to the health sector. At the very least it should include
education of the community on the health problems prevalent and on methods of
preventing health problems from arising or of controlling them; the promotion
of adequate supplies of food and of proper nutrition; sufficient safe water
and basic sanitation; maternal and child health care including family planning;
the prevention and control of locally endemic diseases; immunization against
the main infectious diseases; appropriate treatment of common diseases and
injuries; and the provision of essential drugs. (From Glossary of Terms Used in
the Health for All Series No. 1-8. Geneva: WHO, 1984.)

principal component analysis A statistical method to simplify the description of a
set of interrelated variables. Its general objectives are data reduction and
interpretation; there is no separation into dependent and independent variables; the
original set of correlated variables is transformed into a smaller set of uncorrelated
variables called the principal components. Often used as the first step in a factor
analysis.

prior probability Probability calculated or estimated from theory or belief, before a
study is done. See Bayes' theorem.

PROBABILITY 

1. The limit of the relative frequency of an event in a sequence of N random
trials as N approaches infinity, i.e., the limit of

\frac{\text{Number of occurrences of the event}}{N}

2. A measure, ranging from zero to 1, of the degree of belief in a hypothesis or
statement.
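The first (frequency) definition can be illustrated by simulation: as the number of trials N grows, the relative frequency of an event settles near its underlying probability. The sketch below uses a seeded pseudorandom generator as a stand-in for random trials; the function name, the probability 0.3, and the seed are all arbitrary choices for the example.

```python
import random


def relative_frequency(p, n_trials, seed=0):
    """Fraction of n_trials simulated trials in which the event occurs."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    occurrences = sum(rng.random() < p for _ in range(n_trials))
    return occurrences / n_trials


# The relative frequency should drift toward 0.3 as N increases.
for n in (10, 1000, 100000):
    print(n, relative_frequency(0.3, n))
```

A finite simulation can only suggest the limit; it does not demonstrate it.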

probability density The frequency distribution of a continuous random variable. 

probability distribution For a discrete random variable, the function that gives the
probabilities that the variable equals each of a sequence of possible values.
Examples include the binomial and Poisson distributions. For a continuous random
variable, often used synonymously with the probability density function.

probability sample (Syn: random sample) See sample.

probability theory The branch of mathematics dealing with the purely logical
properties of probability; its theorems underlie most statistical methods.

Proband See PROPOSITUS. 

problem-oriented medical record (POMR) A medical record in which the patient's
history, physical findings, laboratory results, etc., are organized to give a cumulative
record of problems, e.g., hemoptysis, rather than disease, e.g., pneumonia. The
record includes subjective, objective, and significant negative information,
discussions and conclusions, and diagnostic and treatment plans with respect to each
problem. The record, which was developed by Lawrence Weed, 1 contrasts with the
traditional medical record, which is less formally organized, usually recording all
information from each source (history, physical, and laboratory findings) together
without regard to the problems the information describes.

Since the problems may not be described in terms of conventional disease labels,
their classification and counting for epidemiologic purposes are sometimes difficult.
The International Classification of Health Problems in Primary Care (ICHPPC)
is an attempt to overcome this difficulty.

1 Weed LL: Medical records that guide and teach. New Engl J Med 278:593-600, 652-657, 1968.

procatarctic cause A term used by epidemiologists of the late 19th and early 20th
centuries, probably last used by Greenwood, to describe predisposing causes
associated with habits of life.

professional activity study (PAS) The hospital discharge abstract system that
covers many acute short-stay hospitals in the United States. It provides regularly
published statistical tables arranged according to hospital service, diagnostic
category, etc., giving details on diagnostic and therapeutic procedures, length of stay
and outcome.


PROGRAM

1. A (formal) set of procedures to conduct an activity, e.g., control of malaria.

2. An ordered list of instructions directing a computer to carry out a desired
sequence of operations. The objective is normally the solution of a problem.

program evaluation and review techniques (PERT) A work-scheduling method that
uses algorithms and also enunciates general principles of procedure for allocating
resources. Calls for listing specific tasks to be completed and the resources (personnel,
equipment, supplies, and other items) that will be needed, along with their
costs, a time chart indicating when each component task is to begin and end, giving
interim accomplishment levels during that period, and a specification of times for
interim review of the progress of the plan.

program review An evaluative study of a specific health program operating in a
specific setting, performed to provide a basis for decisions concerning the operation of
the program.

program trial An experimental or quasi-experimental evaluative study of a (health)
program.

prolective Pertaining to data collected by planning in advance. Contrast retrolective.
The terms prolective and retrolective, coined by AR Feinstein, 1 are said to describe
more precisely the actions of research workers than the common terms prospective
and retrospective; use of these terms is limited, and is deprecated by many
epidemiologists.

1 Clin Pharmacol Ther 30:564-577, 1981.

proportion A type of ratio in which the numerator is included in the denominator.
The ratio of a part to the whole, expressed as a "decimal fraction" (e.g., 0.2), as a
"vulgar fraction" (1/5), or, loosely, as a percentage (20%). By definition, a proportion
(p) must be in the range (decimal) 0.0 to 1.0. Since numerator and denominator
have the same dimension, any dimensional contents cancel out, and a proportion is
a dimensionless quantity. Where numerator and denominator are based upon counts
rather than upon measurements, the originals are also dimensionless, although it
should be understood that proportions can be used for measured quantities, e.g.,
the skin area of the lower limb is x percent of the total skin area, as well as for
counts, e.g., 0.15 of the population died. A prevalence rate is a count-based
proportion. The nondimensionality of a proportion, and its range limitations, do not
necessarily apply to other kinds of ratios, of which "proportion" is a subset. See also
rate; ratio.

proportional hazards model (Syn: Cox model) A statistical model in survival
analysis that asserts that the effect of the study factors on the hazard rate in the
study population is multiplicative and does not change over time. For example, the
model for two factors x_1 and x_2 asserts that the rate at time t, \lambda(t), is given by

\lambda(t) = \lambda_0(t) \, e^{\beta_1 x_1 + \beta_2 x_2}

where \lambda_0(t) is the rate when x_1 = x_2 = 0, and e is the (natural) exponential function.
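The "multiplicative and does not change over time" property can be checked numerically. The sketch below assumes the usual form of the model, in which a baseline rate is multiplied by the exponential of a weighted sum of the factors; the coefficient values and baseline rates are invented for illustration.

```python
import math


def hazard(baseline_rate, betas, xs):
    """Baseline rate times exp(beta_1 * x_1 + beta_2 * x_2 + ...)."""
    linear_predictor = sum(b * x for b, x in zip(betas, xs))
    return baseline_rate * math.exp(linear_predictor)


betas = (0.7, -0.2)  # hypothetical coefficients for two factors
# Hypothetical baseline rates at three different times.
for baseline_rate in (0.01, 0.05, 0.20):
    exposed = hazard(baseline_rate, betas, (1, 0))
    reference = hazard(baseline_rate, betas, (0, 0))
    print(baseline_rate, round(exposed / reference, 4))
```

Whatever the baseline rate at a given time, the ratio of the two hazards stays at exp(0.7), which is what "proportional hazards" means.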

proportionate mortality rate, ratio (PMR) Number of deaths from a given cause
in a specified time period, per 100 or 1000 total deaths in the same time period.
Can give rise to misleading conclusions if used to compare mortality experience of
populations with different distributions of causes of death.
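The warning about misleading comparisons can be seen in a small worked example. The two populations below are hypothetical and are constructed to be the same size with the same number of deaths from the cause of interest, differing only in deaths from other causes.

```python
def pmr(deaths_from_cause, total_deaths, per=100):
    """Deaths from one cause per `per` total deaths in the same period."""
    return deaths_from_cause / total_deaths * per


# Two hypothetical populations of equal size, each with 50 deaths from
# the cause of interest, but different totals of deaths from all causes.
pmr_a = pmr(50, 500)
pmr_b = pmr(50, 1000)
print(round(pmr_a, 1), round(pmr_b, 1))
```

Population A appears to have twice the burden from this cause, although the cause-specific death rate is identical in both; the difference comes entirely from the other causes of death.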

propositus (Syn: proband) The family member who first draws attention to a (genetic) 
pedigree of a given trait. The index case in a genetic study. 

prospective study See cohort study. 

protocol The plan, or set of steps, to be followed in a study or investigation, or in an 
intervention program. See also algorithm, clinical. 

proximate determinant of fertility Factor having a direct influence on fertility;
such factors include age at marriage, breastfeeding, abortion, and contraceptive
use.

public health Public health is one of the efforts organized by society to protect,
promote, and restore the people's health. It is the combination of sciences, skills, and
beliefs that is directed to the maintenance and improvement of the health of all the
people through collective or social actions. The programs, services, and institutions
involved emphasize the prevention of disease and the health needs of the population
as a whole. Public health activities change with changing technology and social
values, but the goals remain the same: to reduce the amount of disease, premature
death, and disease-produced discomfort and disability in the population. Public health
is thus a social institution, a discipline, and a practice.

punch card A card on which data are stored by means of holes punched in specified
positions; useful in storing, processing, and analyzing data. Edge-punch cards have
marginal holes converted to slots by punching so that they can be manually sorted.
The commonly used variety of punch cards have 80 columns and 12 rows. In each
column of the card there are 12 positions at which holes may be punched, according
to a predetermined code. The position of the hole is the means of identifying
the value of a variable. Punch cards of this type are sorted mechanically or
electrically to provide a rapid means of processing and analyzing data, sometimes of great
complexity. See also data processing.

P value See P (probability). 


QALY Acronym for quality-adjusted life years; this is an adjustment of life expectancy
that allows for prevalence of activity-limitation, assessed from hospital discharge
data or by health survey data, in the population subgroup for which QALY is
calculated. For example, the life expectancy of males at birth in Canada in 1978 was
70.8 years; after adjusting for activity-limitation using health survey data,
quality-adjusted life expectancy, or QALY, was 65.8 years. 1

1 Wilkins R, Adams O: Healthfulness of Life. Montreal, 1983.

qualitative data Observations or information characterized by measurement on a
categorical scale, i.e., a dichotomous or nominal scale, or, if the categories are
ordered, an ordinal scale. Examples are sex, hair color, death or survival, and
nationality. See also measurement scale.

quality control The supervision and control of all operations involved in a process,
usually involving sampling and inspection, in order to detect and correct systematic
or excessively random variations in quality.

quality of care A level of performance or accomplishment that characterizes the health
care provided. Ultimately, measures of the quality of care always depend upon
value judgments, but there are ingredients and determinants of quality that can be
measured objectively. These ingredients and determinants have been classified by
Donabedian 1 into measures of structure (e.g., manpower, facilities), process (e.g.,
diagnostic and therapeutic procedures), and outcome (e.g., case fatality rates,
disability rates, and levels of patient satisfaction with the service). See also health
services research.

1 Donabedian A: A Guide to Medical Care Administration (Vol. 2). New York: American Public Health
Association, 1969.

quality of life In a general sense, that which makes life worth living. In a more
"quantitative" sense, an estimate of remaining life free of impairment, disability or
handicap, as used in the expression "quality adjusted life years;" somewhere between
these is an estimate of the utility of life. For instance, in clinical decision
analysis, the utility of life that is impaired by a disabling degree of angina pectoris
may be compared with that of a life that may be shorter in duration but free of
disabling pain, as a result of applying therapeutic procedures. Such trade-offs are
part of clinical decision analysis. See also utility.

quantiles Divisions of a distribution into equal, ordered subgroups. Deciles are tenths;
quartiles, quarters; quintiles, fifths; terciles, thirds; and centiles, hundredths.
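Cut points for quantiles can be obtained with the `quantiles` function in Python's standard `statistics` module, which returns the n - 1 values that divide the data into n groups. The data below are made up, and the function's default interpolation method is only one of several conventions in use.

```python
from statistics import quantiles

# Hypothetical ordered observations: the integers 1 to 11.
data = list(range(1, 12))

quartile_cuts = quantiles(data, n=4)    # three cut points -> four quarters
tercile_cuts = quantiles(data, n=3)     # two cut points -> three thirds
decile_cuts = quantiles(data, n=10)     # nine cut points -> ten tenths

print(quartile_cuts)
print(len(tercile_cuts), len(decile_cuts))
```

Note that a quantile in this computational sense is the cut point, whereas the entry above describes the subgroups that the cut points define.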

quantitative data Data in numerical quantities such as continuous measurements or 
counts. 

quarantine The 14th edition of Control of Communicable Diseases in Man 1 gives the
following:

Restriction of the activities of well persons or animals who have been exposed to
a case of communicable disease during its period of communicability (i.e., contacts)
to prevent disease transmission during the incubation period if infection should
occur.

a) Absolute or complete quarantine: The limitation of freedom of movement of
those exposed to a communicable disease for a period of time not longer than
the longest usual incubation period of that disease, in such manner as to prevent
effective contact with those not so exposed (see Isolation).

b) Modified quarantine: A selective, partial limitation of freedom of movement of
contacts, commonly on the basis of known or presumed differences in
susceptibility and related to the danger of disease transmission. It may be designed to
meet particular situations. Examples are exclusion of children from school,
exemption of immune persons from provisions applicable to susceptible persons, or
restriction of military populations to the post or to quarters. It includes: Personal
surveillance, the practice of close medical or other supervision of contacts in
order to permit prompt recognition of infection or illness but without restricting
their movements; and Segregation, the separation of some part of a group of
persons or domestic animals from the others for special consideration, control or
observation, e.g., removal of susceptible children to homes of immune persons, or
establishment of a sanitary boundary to protect uninfected from infected
portions of a population.

See also isolation.

1 Washington DC: American Public Health Association, 1985.

quasi-experiment An experiment in which the investigator lacks full control over the
allocation and/or the timing of the intervention. 

questionnaire A predetermined set of questions used to collect data—clinical data, 
social status, occupational group, etc. This term is often applied to a self-completed 
survey instrument, as contrasted with an interview schedule. 

Quetelet, Lambert Adolphe Jacques (1796-1857) Belgian astronomer, statistician,
and social scientist, one of the first to apply statistical thinking to the social and
biological sciences, e.g., in delineating the (normal) distribution of variables such as
height in the population. He influenced others who followed, e.g., Florence
Nightingale.

Quetelet's index See body mass index.

quota sampling A method by which the proportions in the sample in various subgroups 
(according to criteria such as age, sex, and social status of the individuals to be 
selected) are chosen to agree with the corresponding proportions in the population. 
The resulting sample may not be representative of characteristics that have not 
been taken into account. 

quotient The result of the division of a numerator by a denominator. 


race Persons who are relatively homogeneous with respect to biological inheritance.
See also ethnic group.

radix The hypothetical size of the birth cohort in a life table, commonly 1000 or 100,000.

Rahe-Holmes social readjustment rating scale See life events. 

Ramazzini, Bernardino (1633-1714) An Italian physician, "Father of Occupational
Medicine;" he published De Morbis Artificum (On the Diseases of Workers) in 1700.
Based on observation and anecdote, this was the first systematic account of diseases
related to workplace exposures.

random Governed by chance; not completely determined by other factors. As opposed 
to deterministic. 

RANDOM ALLOCATION See RANDOMIZATION. 

randomization Allocation of individuals to groups, e.g., for experimental and control
regimens, by chance. Within the limits of chance variation, randomization should
make the control and experimental groups similar at the start of an investigation
and ensure that personal judgment and prejudices of the investigator do not
influence allocation.

Randomization or random assignment should not be confused with haphazard
assignment. Random assignment follows a predetermined plan that is usually
devised with the aid of a table of random numbers. The pattern of assignment may
appear to be haphazard, but this arises from the haphazard nature with which
digits occur in a table of random numbers, and not from the haphazard whim of
the investigator in allocating patients.
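A predetermined allocation plan of the kind described can be sketched with a seeded pseudorandom generator standing in for the table of random numbers. This substitution, the function name, the seed value, and the equal split into two groups are all assumptions made for the example.

```python
import random


def randomize(subject_ids, seed):
    """Allocate subjects to two groups by chance, reproducibly from a seed."""
    rng = random.Random(seed)      # the seed fixes the plan in advance
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "experimental": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical trial with 20 numbered subjects.
groups = randomize(range(1, 21), seed=1987)
print(groups["experimental"])
print(groups["control"])
```

Because the seed is chosen before enrollment, the assignment can be reproduced and audited, yet the investigator has no say in which subject goes to which group.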

randomized controlled trial (RCT) An epidemiologic experiment in which subjects
in a population are randomly allocated into groups, usually called "study" and
"control" groups, to receive or not to receive an experimental preventive or therapeutic
procedure, maneuver, or intervention. The results are assessed by rigorous
comparison of rates of disease, death, recovery, or other appropriate outcome in the
study and control groups, respectively. Randomized controlled trials are generally
regarded as the most scientifically rigorous method of hypothesis testing available
in epidemiology. A few authors refer to this method as "randomized control trial."
See also experimental epidemiology.

random sample A sample that is arrived at by selecting sample units such that each 
possible unit has a fixed and determinate probability of selection. See also sample. 

range of distribution The difference between the largest and smallest values in a 
distribution. 

ranking scale (Ordinal Scale) A scale that arrays the members of a group from high
to low according to the magnitude of the observations, assigns numbers to the ranks,
and neglects distances between members of the array.



rate A rate is a measure of the frequency of a phenomenon. In epidemiology,
demography, and vital statistics, a rate is an expression of the frequency with which an
event occurs in a defined population; the use of rates rather than raw numbers is
essential for comparison of experience between populations at different times,
different places, or among different classes of persons.

The components of a rate are the numerator, the denominator, the specified time
in which events occur, and usually a multiplier, a power of 10, which converts the
rate from an awkward fraction or decimal to a whole number:

Rate = \frac{\text{Number of events in specified period}}{\text{Average population during the period}} \times 10^n

All rates are ratios, calculated by dividing a numerator, e.g., the number of deaths,
or newly occurring cases of a disease in a given period, by a denominator, e.g., the
average population during that period. Some rates are proportions, i.e., the
numerator is contained within the denominator. Rate has several different usages in
epidemiology.

1. As a synonym for ratio, it refers to proportions as rates, as in the terms
cumulative incidence rate, prevalence rate, survival rate (cf. Webster's Dictionary,
which gives proportion and ratio as synonyms for rate).

2. In other situations, rate refers only to ratios representing relative changes
(actual or potential) in two quantities. This accords with the OED, which gives
"relative amount of variation" among its entries for rate.

3. Sometimes rate is further restricted to refer only to ratios representing changes
over time. In this usage, prevalence rate would not be a "true" rate because it
cannot be expressed in relation to units of time but only to a "point" in time;
in contrast, the force of mortality or force of morbidity (hazard rate) is a "true"
rate for it can be expressed as the number of cases developing per unit time,
divided by the total size of the population at risk.

rate difference (RD) The absolute difference between two rates, for example, the
difference in incidence rate between a population group exposed to a causal factor
and a population group not exposed to the factor:

RD = I_e - I_u

where I_e = incidence rate among exposed, and I_u = incidence rate among
unexposed. In comparisons of exposed and unexposed groups, the term excess rate may
be used as a synonym for rate difference.

RATE-ODDS RATIO See ODDS RATIO.

rate ratio (RR) The ratio of two rates. The term is used in epidemiologic research
with a precise meaning, i.e., the ratio of the rate in the exposed population to the
rate in the unexposed population:

RR = \frac{I_e}{I_u}

where I_e is the incidence rate among exposed, and I_u is the incidence rate among
unexposed. See also relative risk.
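A short sketch shows how a rate, a rate difference, and a rate ratio relate to one another when computed from the same counts. The function names, the counts, and the multiplier of 1000 are hypothetical choices for the example.

```python
def rate(events, average_population, multiplier=1000):
    """Events per `multiplier` persons over the specified period."""
    return events / average_population * multiplier


def rate_difference(rate_exposed, rate_unexposed):
    """Absolute difference between the exposed and unexposed rates."""
    return rate_exposed - rate_unexposed


def rate_ratio(rate_exposed, rate_unexposed):
    """Rate in the exposed divided by rate in the unexposed."""
    return rate_exposed / rate_unexposed


# Hypothetical one-year follow-up of exposed and unexposed groups.
i_e = rate(90, 30000)
i_u = rate(60, 60000)
print(round(i_e, 2), round(i_u, 2))
print(round(rate_difference(i_e, i_u), 2), round(rate_ratio(i_e, i_u), 2))
```

The rate difference keeps the units of the rates (here, cases per 1000 per year), whereas the rate ratio is dimensionless.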

ratio The value obtained by dividing one quantity by another: a general term of which
rate, proportion, percentage, etc., are subsets. The important difference between a
proportion and a ratio is that the numerator of a proportion is included in the
population defined by the denominator, whereas this is not necessarily so for a
ratio. A ratio is an expression of the relationship between a numerator and a
denominator where the two usually are separate and distinct quantities, neither being
included in the other.

The dimensionality of a ratio is obtained through algebraic cancellation,
summation, etc., of the dimensionalities of its numerator and denominator terms. Both
counted and measured values may be included in the numerator and in the
denominator. There are no general restrictions on the dimensionalities or ranges of ratios,
as there are in some of its subsets (e.g., proportion, prevalence). Ratios are
sometimes expressed as percentages (e.g., standardized mortality ratio, FEV1 percent).
In these cases, unlike the special case of a proportion, the value may exceed 100.
See also proportion; rate.

RATIO SCALE See MEASUREMENT SCALE. 

receiver operating characteristic (ROC) curve (Syn: relative operating
characteristic curve) A graphic means for assessing the ability of a screening test to
discriminate between healthy and diseased persons. The term "receiver operating
characteristic" comes from psychometry where the characteristic operating response of a
receiver-individual to faint stimuli or nonstimuli was recorded.

record linkage A method for assembling the information contained in two or more 
records, e.g., in different sets of medical charts, and in vital records such as birth 
and death certificates, and a procedure to ensure that the same individual is counted 
only once. This procedure incorporates a unique identifying system such as a
personal identification number and/or birth name(s) of the individual's mother.

Record linkage makes it possible to relate significant health events that are
remote from one another in time and place or to bring together records of different
individuals, e.g., members of a family. The resulting information is generally stored 
and retrieved by computer, which can be programmed to tabulate and analyze the 
data. 

"Each person in the world creates a book of life. This book starts with birth and 
ends with death. Its pages are made of the records of the principal events in life. 
Record linkage is the name given to the process of assembling the pages of this 
book into a volume." 1 

1 Dunn HL: Record linkage. Am J Pub Health 36:1412, 1946.

recrudescence Reactivation of infection. 

Reed, Walter (1851-1902) US Army physician and epidemiologist. Responsible for 
epidemiologic investigations and experiments that established the transmission of 
yellow fever by a filterable virus carried by culicine mosquitoes. The rigorous logic 
applied to both the experimental and incidental observations by Reed and his col¬ 
leagues is recognized as one of the great achievements of medical science. 

reference population The standard against which a population that is being studied
can be compared. 

refinement The process of identifying new subcategories or study variables for the 
purpose of more accurate or more detailed description of relationships. An exam¬ 
ple is refinement of the concept of serum cholesterol level into high, low, and very 
low density lipoproteins. 

register, registry In epidemiology the term “register" is applied to the file of data 
concerning all cases of a particular disease or other health-relevant condition in a 
defined population such that the cases can be related to a population base. With 
this information incidence rates can be calculated. If the cases are regularly fol¬ 
lowed up, information on remission, exacerbation, prevalence, and survival can also 
be obtained. The register is the actual document, and the registry is the system of
ongoing registration. 



In most developed countries all births and deaths are recorded through birth 
and death registration systems. Results and summaries are then tabulated and pub¬ 
lished. Examples of registries that have epidemiologic value include the following: 

Cancer registries, which secure reports of cancer patients as soon as possible after
first diagnosis. The principal sources for these reports are the hospitals serving the 
community, but a few cases are not reported until death. 

Twin registries, which have provided the basis for studies attempting to
differentiate genetic from environmental factors in the etiology of cancer, and other
conditions where both genetic and environmental factors may be contributing causes.

Birth defect registries, which seek to document anomalies that are apparent at or 
soon after birth. They suffer from incompleteness due to omission of stillbirths and 
of anomalies that do not declare their presence until later in life, such as certain 
forms of congenital heart lesion, mental deficiency, and neurological disorders. 

Other types of registers include blindness and other forms of physical handicap, 
high-risk infants, persons addicted to drugs, etc. Most of these, however, are not
truly population based, but merely list those persons known to or attending some 
agency or service that provides for them. 

registration The term "registration" implies something more than notification for the 
purpose of immediate action or to permit the counting of cases. A register requires 
that a permanent record be established, including identifying data. Cases may be 
followed up, and statistical tabulations may be prepared both on frequency and on 
survival. In addition, the persons listed on a register may be subjects of special 
studies. 

REGRESSION 

1. As used by Francis Galton, regression meant the tendency for offspring of
exceptional parents (very tall, very intelligent, etc.) to possess characteristics
closer to the average for the general population. (Hence, "regression to the
mean.")

2. In statistics, regression is a synonym for regression analysis. 

regression analysis Given data on a dependent variable y and one or more
independent variables x1, x2, etc., regression analysis involves finding the "best" mathematical
model (within some restricted class of models) to describe y as a function of the x's,
or to predict y from the x's. The most common form is a linear model; in
epidemiology, the logistic and proportional hazards models are also common.
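
To make the idea of a "best" model within a restricted class concrete, here is a minimal sketch of ordinary least squares for a single independent variable, where "best" is taken to mean the smallest sum of squared residuals. That criterion, the function name `fit_line`, and the data are assumptions for illustration; the entry itself does not prescribe a fitting method.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for one independent variable."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squares of x about its mean, and sum of cross-products of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical observations lying exactly on y = 1 + 2x.
intercept, slope = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print("intercept:", intercept, "slope:", slope)
```

The logistic and proportional hazards models mentioned above follow the same pattern of choosing parameters within a model class, but they are fitted by maximum likelihood rather than by this closed-form solution.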

regression line Diagrammatic presentation of a regression equation, usually drawn
with the independent variable, x, as the abscissa and the dependent variable, y, as
ordinate. Three variables can be shown diagrammatically on an isometric chart or 
stereogram. 

relationship See association. 

relative odds See odds ratio. 

RELATIVE RISK 

1. The ratio of the risk of disease or death among the exposed to the risk among
the unexposed; this usage is synonymous with risk ratio. 

2. Alternatively, the ratio of the cumulative incidence rate in the exposed to the 
cumulative incidence rate in the unexposed, i.e., the cumulative incidence ra¬ 
tio. 

3. The term "relative risk" has also been used synonymously with "odds ratio"
and, in some biostatistical articles, has been used for the ratio of forces of
morbidity. The use of the term "relative risk" for several different quantities
arises from the fact that for "rare" diseases (e.g., most cancers) all the
quantities approximate one another. For common occurrences (e.g., neonatal
mortality in infants under 1500 g birth weight), the approximations do not hold.

See also cumulative incidence ratio; odds ratio; rate ratio; risk ratio.
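
The claim that the risk ratio and odds ratio approximate one another only for rare outcomes can be checked numerically. The sketch below uses two hypothetical cohorts with the same risk ratio, one with a rare outcome and one with a common outcome; all counts and function names are assumed for illustration.

```python
def risk_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exposed_cases / (exposed_cases + exposed_noncases)
    risk_unexposed = unexposed_cases / (unexposed_cases + unexposed_noncases)
    return risk_exposed / risk_unexposed


def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Hypothetical rare outcome: risks of 0.2% and 0.1%.
rare = (20, 9980, 10, 9990)
# Hypothetical common outcome: risks of 40% and 20%.
common = (400, 600, 200, 800)

for label, counts in (("rare", rare), ("common", common)):
    print(label, "risk ratio:", round(risk_ratio(*counts), 3),
          "odds ratio:", round(odds_ratio(*counts), 3))
```

In both assumed cohorts the risk ratio is 2, but the odds ratio stays close to 2 only when the outcome is rare; for the common outcome it is noticeably larger.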
reliability The degree of stability exhibited when a measurement is repeated under
identical conditions. Reliability refers to the degree to which the results obtained by
a measurement procedure can be replicated. Lack of reliability may arise from
divergences between observers or instruments of measurement or instability of the
attribute being measured. See also measurement, problems with terminology;
observer variation.

repeatability (Syn: reproducibility) A test or measurement is repeatable if the results
are identical or closely similar each time it is conducted. See also measurement, 
problems with terminology; reliability. 

replacement level fertility The level of fertility at which a cohort of women are 
having only enough daughters to replace themselves in the population. By defini¬ 
tion, replacement level fertility is equal to a net reproduction rate of 1.00. The total 
fertility rate is also used as a measure of replacement level fertility; in the United 
States today, a total fertility rate of 2.12 is considered to be replacement level; it is 
higher than 2 because of mortality and because of a sex ratio greater than 1 at
birth. The higher the mortality rate, the higher is replacement level fertility.

replication The execution of an experiment or survey more than once so as to
confirm the findings, increase precision, and obtain a closer estimation of sampling
error. Exact replication should be distinguished from consistency of results on replication.

Exact replication is often possible in the physical sciences, but in the biological and
behavioral sciences, to which epidemiology belongs, consistency of results on
replication is often the best that can be attained. Consistency of results on replication is
perhaps the most important criterion in judgments of causality.

representative sample The term "representative" as it is commonly used is
undefined in the statistical or mathematical sense; it means simply that the sample
resembles the population in some way.

The use of probability sampling will not ensure that any single sample will be
"representative" of the population in all possible respects. If, for example, it is found
that the sample age distribution is quite different from that of the population, it is
possible to make corrections for the known differences. A common fallacy lies in
the unwarranted assumption that, if the sample resembles the population closely
on those factors that have been checked, it is "totally representative" and that no
difference exists between the sample and the universe or reference population.

Kendall and Buckland 1 comment as follows: "In the widest sense, a sample which
is representative of a population. Some confusion arises according to whether
'representative' is regarded as meaning 'selected by some process which gives all
samples an equal chance of appearing to represent the population'; or, alternatively,
whether it means 'typical in respect of certain characteristics, however chosen'. On
the whole, it seems best to confine the word 'representative' to samples which turn
out to be so, however chosen, rather than apply it to those chosen with the object
of being representative."

1 Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.

REPRODUCIBILITY See REPEATABILITY.

reproductive isolation Absence of interbreeding between populations.

research design The procedures and methods, predetermined by an investigator, to
be adhered to in conducting a research project.

reservoir of infection

1. Any person, animal, arthropod, plant, soil, or substance, or a combination of
these, in which an infectious agent normally lives and multiplies, on which it
depends primarily for survival, and where it reproduces itself in such a
manner that it can be transmitted to a susceptible host.

2. The natural habitat of the infectious agent.

resolution (Syn: resolving power) A component of a measuring instrument that helps
determine precision. The degree of refinement of the measuring process is
commonly referred to as the "resolution" or the "resolving power of the system." See
also power. The capability of distinguishing between things that are indeed
separate or distinct from one another.

resolving power The capacity of a system to distinguish between truly distinct things
that are close together.

response rate The number of completed or returned survey instruments
(questionnaires, interviews, etc.) divided by the total number of persons who would have
been surveyed if all had participated. Usually expressed as a percentage.
Nonresponse can have several causes, e.g., death, removal out of the survey community,
and refusal. See also bias; completion rate; nonparticipants.

retrolective Pertaining to data gathered from medical records or other sources, when
data collection took place without prior planning for the needs of an investigation.
See also prolective; term in limited use.

retrospective study A research design that is used to test etiologic hypotheses in
which inferences about exposure to the putative causal factor(s) are derived from
data relating to characteristics of the persons under study or to events or
experiences in their past. The essential feature is that some of the persons under study
have the disease or other outcome condition of interest, and their characteristics
and past experiences are compared with those of other, unaffected persons.
Persons who differ in the severity of the disease may also be compared. There is
disagreement among epidemiologists as to the desirability of using the term
"retrospective study" rather than "case control study" to describe this method. See also
CASE CONTROL STUDY.

retrovirus This name is given to a family of RNA viruses characterized by the
presence of an enzyme, reverse transcriptase, that enables transcription of RNA to DNA
inside an affected cell. Thus, retroviruses can make copies of themselves in host 
cells. The most important retrovirus is the human immunodeficiency virus (HIV); 
this makes copies of itself in host cells such as T4 "helper" lymphocytes and normal 
immune responses are disrupted. 

risk The probability that an event will occur, e.g., that an individual will become ill or
die within a stated period of time or age. Also, a nontechnical term encompassing 
a variety of measures of the probability of a (generally) unfavorable outcome. See 
also PROBABILITY. 

risk assessment The qualitative or quantitative estimation of the likelihood of adverse 
effects that may result from exposure to specified health hazards or from the
absence of beneficial influences.

risk benefit analysis The process of analyzing and comparing on a single scale the 
expected positive (benefits) and negative (risks, costs) results of an action, or lack of
an action. 

risk benefit ratio The results of a risk benefit analysis, expressed as the ratio of risks 
to benefits. 

risk difference (Syn: excess risk) The absolute difference between two risks.

risk factor An aspect of personal behavior or lifestyle, an environmental exposure,
or an inborn or inherited characteristic, which on the basis of epidemiologic
evidence is known to be associated with health-related condition(s) considered
important to prevent. The term "risk factor" is rather loosely used, with any of the
following meanings:

1. An attribute or exposure that is associated with an increased probability of a 
specified outcome, such as the occurrence of a disease. Not necessarily a causal 
factor. A risk marker. 

2. An attribute or exposure that increases the probability of occurrence of dis¬ 
ease or other specified outcome. A determinant. 

3. A determinant that can be modified by intervention, thereby reducing the 
probability of occurrence of disease or other specified outcomes. To avoid 
confusion, it may be referred to as a modifiable risk factor. 

risk management The steps taken to alter, i.e., reduce, the levels of risk to which an
individual or a population is subject. 

risk marker (Syn: risk indicator) An attribute that is associated with an increased
probability of occurrence of a disease or other specified outcome and that can be
used as an indicator of this increased risk. Not necessarily a causal factor. See also
RISK FACTOR.

risk ratio The ratio of two risks. 

robust A statistical test or procedure is said to be robust if it is not very sensitive to 
departures from the assumptions on which it is strictly predicated (e.g., that the data
are normally distributed). 

Ross, Ronald (1857-1932) Continued in India the work begun by Laveran and
Manson on mosquitoes as vectors of infectious disease. In a series of experiments and
microscopic dissections, he concluded that only the anopheles mosquitoes carried
the malaria parasite and that a developmental stage of the parasite took place in
the mosquito (On some peculiar pigmented cells found in two mosquitoes fed on
malarial blood. Br Med J 1786-1787, 1897). Awarded the Nobel prize for medicine
in 1902.

rubric Section or chapter heading. Used in epidemiology with reference to groups of
diseases, e.g., as in the International Classification of Diseases (ICD).

safety factor A multiplicative factor incorporated in risk assessments or safety
standards to allow for unpredictable types of variation, such as variability from test
animals to humans, random variation within an experiment, and person-to-person
variability. Safety factors are often in the range of 10 to 1000.

safety standards Under the requirements of the Occupational Safety and Health Act
(OSHA, 1970), "occupational safety and health standard" means a standard that
requires conditions, or the adoption of one or more practices, means, methods, 
operations, or processes reasonably necessary or appropriate to provide safe or 
healthful employment and places of employment. Safety standards may be adopted 
by national consensus or established by federal regulation. These standards have 
been adopted in many other nations besides the United States, although some Eu¬ 
ropean and other countries have their own standards, which may be lower or higher 
than those in the United States. 

There are several varieties of safety standards:

1. OSHA-promulgated, mainly for carcinogens, also for cotton dust and lead.
These are Permissible Exposure Limits (PELs).

2. National Institute of Occupational Safety and Health (NIOSH)
recommendations, often lower limits, based on animal toxicity tests, empirical observations,
epidemiologic investigations: these are Recommended Exposure Limits (RELs).

3. An older-established set of criteria has been set by the American Conference
of Governmental Industrial Hygienists: these are Threshold Limit Values
(TLVs) that have now replaced an earlier set of Maximum Allowable
Concentrations (MACs).

sample A selected subset of a population. A sample may be random or nonrandom 
and may be representative or nonrepresentative. Several types of sample can be 
distinguished, including the following: 

Cluster sample: Each unit selected is a group of persons (all persons in a city block,
a family, etc.) rather than an individual. 

Grab sample (Syn: sample of convenience): These ill-defined terms describe
samples selected by easily employed but basically nonprobabilistic methods.
"Man-in-the-street" surveys and a survey of blood pressure among volunteers who drop in
at an examination booth in a public place are in this category. It is improper to 
generalize from the results of a survey based upon such a sample for there is no 
way of knowing what sorts of bias may have been operating. See also bias. 

Probability (random) sample: All individuals have a known chance of selection. They
may all have an equal chance of being selected, or, if a stratified sampling method 
is used, the rate at which individuals from several subsets are sampled can be varied 
so as to produce greater representation of some classes than of others. 

A probability sample is created by assigning an identity (label, number) to all 
individuals in the “universe” population, e.g., by arranging them in alphabetical 
order and numbering in sequence, or simply assigning a number to each, or by 
grouping according to area of residence and numbering the groups. The next step
is to select individuals (or groups) for study by a procedure such as use of a table
of random numbers (or comparable procedure) to ensure that the chance of
selection is known.

Simple random sample: In this elementary kind of sample each person has an equal 
chance of being selected out of the entire population. One way of carrying out this 
procedure is to assign each person a number, starting with 1, 2, 3, and so on. Then
numbers are selected at random, preferably from a table of random numbers, until 
the desired sample size is attained. 

Stratified random sample: This involves dividing the population into distinct subgroups 
according to some important characteristic, such as age or socioeconomic status, 
and selecting a random sample out of each subgroup. If the proportion of the 
sample drawn from each of the subgroups, or strata, is the same as the proportion 
of the total population contained in each stratum (e.g., age group 40-59 constitutes
20% of the population, and 20% of the sample comes from this age stratum), then
all strata will be fairly represented with regard to numbers of persons in the sample.

Systematic sample: The procedure of selecting according to some simple, systematic
rule, such as all persons whose names begin with specified alphabetic letters, born
on certain dates, or located at specified points on a master list. A systematic sample
may lead to errors that invalidate generalizations. For example, persons' names
more often begin with certain letters of the alphabet than with other letters, e.g., q,
x. A systematic alphabetical sample is therefore likely to be biased.
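
The sample types described above can be sketched in a few lines. This is an illustrative outline only: the function names, the numbered population of 100 persons, and the 10% sampling fraction are assumptions, and Python's pseudorandom generator stands in for the table of random numbers.

```python
import random


def simple_random_sample(population, n, rng):
    """Each unit has an equal chance of selection; drawn without replacement."""
    return rng.sample(population, n)


def stratified_sample(population, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum (proportional allocation)."""
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(rng.sample(members, round(len(members) * fraction)))
    return chosen


def systematic_sample(population, step, start=0):
    """Take units at fixed intervals along a master list."""
    return population[start::step]


rng = random.Random(42)  # fixed seed so the illustration is repeatable
# Hypothetical universe: persons numbered 1 to 100, of whom 20% are aged 40-59.
population = [(i, "40-59" if i <= 20 else "other") for i in range(1, 101)]

srs = simple_random_sample(population, 10, rng)
strat = stratified_sample(population, lambda unit: unit[1], 0.10, rng)
syst = systematic_sample(population, step=10)
print(len(srs), len(strat), len(syst))
```

With proportional allocation the 40-59 stratum supplies 2 of the 10 sampled persons, matching its 20% share of the population. The systematic sample depends entirely on how the master list is ordered, which is the source of the bias the entry warns about.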

sample, epsem ("equal probability of selection method") A sample selected in such a 
manner that all the population units have the same probability of selection. A sim¬ 
ple random sample is an Epsem sample; a stratified sample is not unless the prob¬ 
ability of selection is the same for all strata. 

sampling The process of selecting a number of subjects from all the subjects in a
particular group or "universe." Conclusions based on sample results may be attributed
only to the population sampled. Any extrapolation to a larger or different
population is a judgment or a guess and is not part of statistical inference.

SAMPLING ERROR See ERROR. 

sampling variation Since the inclusion of individuals in a sample is determined by 
chance, the results of analysis in two or more samples will differ, purely by chance. 
This is known as "sampling variation." 
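
A small simulation can make sampling variation visible. The sketch below draws repeated simple random samples from one hypothetical population with a true prevalence of 20% and reports the prevalence observed in each; the population, sample size, and function name are assumptions for illustration.

```python
import random


def sample_proportions(population, sample_size, n_samples, seed=0):
    """Proportion of positives in each of several independent random samples."""
    rng = random.Random(seed)
    return [
        sum(rng.sample(population, sample_size)) / sample_size
        for _ in range(n_samples)
    ]


# Hypothetical population of 1,000 persons: 1 = has the condition, 0 = does not.
population = [1] * 200 + [0] * 800
props = sample_proportions(population, sample_size=50, n_samples=5)
print(props)
```

Every sample comes from the same population, so any spread among the printed proportions is due to chance alone. Larger samples would be expected to cluster more tightly around the true value of 0.20.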

SANITARY CORDON See CORDON SANITAIRE. 

scatter diagram (Syn: scattergram) A graphic method of displaying the distribution
of two variables in relation to each other. The values for one variable are measured 
on the horizontal axis and the values for the other on the vertical axis. 

scenario building A method of predicting the future that relies on a series of
assumptions about alternative possibilities, rather than on simple extrapolation of
existing trends. Trend lines for demographic composition, morbidity and mortality
rates, etc., can then be modified by allowing for each assumption in turn, or com¬ 
binations of assumptions. The method is claimed to lead to greater flexibility in 
long-range health planning than simple forecasting that relies only upon extrapo¬ 
lation of trends. 

screening Screening was defined in 1951 by the US Commission on Chronic Illness as,
"The presumptive identification of unrecognized disease or defect by the
application of tests, examinations or other procedures which can be applied rapidly.
Screening tests sort out apparently well persons who probably have a disease from
those who probably do not. A screening test is not intended to be diagnostic.
Persons with positive or suspicious findings must be referred to their physicians for
diagnosis and necessary treatment."

Screening is an initial examination only, and positive responders require a
second, diagnostic examination. The initiative for screening usually comes from the
investigator or the person or agency providing care rather than from a patient with
a complaint. Screening is usually concerned with chronic illness and aims to detect
disease not yet under medical care.

There are different types of medical screening, each with its own aim: mass,
multiple or multiphasic, and prescriptive.

Mass screening simply means the screening of a whole population.

Multiple or multiphasic screening involves the use of a variety of screening tests
on the same occasion.

Prescriptive screening has as its aim the early detection in presumptively healthy 
individuals of disease that can be controlled better if detected early in its natural 
history. 

The characteristics of a screening test include accuracy, estimates of yield,
precision, reproducibility, sensitivity and specificity, and validity. See entries under these
headings.

screening level The normal limit or cutoff point at which a screening test is regarded
as positive.

seasonal variation Change in physiological status or in disease occurrence that con¬ 
forms to a regular seasonal pattern. 

secondary attack rate The proportion of contacts who get a communicable disease
as a consequence of contact with a case. The secondary attack rate is a measure
of contagiousness and is useful in evaluating control measures. See also ATTACK
RATE.

secular trend (Syn: temporal trend) Changes over a long period of time, generally
years or decades. Examples include the decline of tuberculosis mortality and the
rise, followed by a decline, in coronary heart disease mortality in the United States
and many other countries in the past 50 years.

selection In genetics, the force that brings about changes in the frequency of alleles
and genotypes in populations through differential reproduction. In epidemiology, 
the process and procedure for choosing individuals for study, usually by an orderly 
means such as random allocation. 

SELECTION BIAS See BIAS.

Semmelweis, Ignaz Philipp (1818-1865) An Austro-Hungarian physician-obstetrician,
who discovered the cause of puerperal fever by carefully comparing infection rates
in two wards of the Allgemeines Krankenhaus in Vienna. In one ward students
customarily came direct from the mortuary or the dissecting room to the patients'
bedside whereas in the other, they did not. Puerperal infection death rates were
much greater in the former. Semmelweis concluded that some morbid factor was
thus transmitted to women in the worse-affected ward. Unhappily, his conclusions
were rejected by his colleagues.

sensitivity and specificity (of a screening test) Sensitivity is the proportion of truly
diseased persons in the screened population who are identified as diseased by the
screening test. Sensitivity is a measure of the probability of correctly diagnosing a
case, or the probability that any given case will be identified by the test (Syn: true
positive rate).

Specificity is the proportion of truly nondiseased persons who are so identified by
the screening test. It is a measure of the probability of correctly identifying a
nondiseased person with a screening test (Syn: true negative rate). The relationships
are shown in the following fourfold table, in which the letters a, b, c, and d
represent the quantities specified below the table.

                                   True status
Screening test results     Diseased     Not diseased     Total
Positive                      a              b           a + b
Negative                      c              d           c + d
Total                       a + c          b + d         a + b + c + d

a. Diseased individuals detected by the test (true positives)
b. Nondiseased individuals positive by the test (false positives)
c. Diseased individuals not detectable by the test (false negatives)
d. Nondiseased individuals negative by the test (true negatives)

Sensitivity = $\frac{a}{a+c}$     Specificity = $\frac{d}{b+d}$

Predictive value (positive test result) = $\frac{a}{a+b}$

Predictive value (negative test result) = $\frac{d}{c+d}$


See also Youden's test. 
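
The four measures of a screening test can be computed directly from the cells of a fourfold table. In the sketch below, a = true positives, b = false positives, c = false negatives, and d = true negatives; the function name and the counts are hypothetical.

```python
def screening_measures(a, b, c, d):
    """Sensitivity, specificity, and predictive values from a fourfold table.

    a: diseased, test positive (true positives)
    b: not diseased, test positive (false positives)
    c: diseased, test negative (false negatives)
    d: not diseased, test negative (true negatives)
    """
    return {
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "predictive_value_positive": a / (a + b),
        "predictive_value_negative": d / (c + d),
    }


# Hypothetical screening of 1,000 persons, 100 of whom are truly diseased.
measures = screening_measures(a=90, b=50, c=10, d=850)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the columns of true status, whereas the predictive values are calculated within the rows of test result. The predictive values therefore depend on how common the disease is in the screened population.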

sensitivity testing A study of how the final outcome of an analysis changes as a
tion of varying one or more of the input parameters in a prescribed manner. 

sentinel health event A condition that can be used to assess the stability or change
in health levels of a population, usually by monitoring mortality statistics. Thus,
death due to acute head injury is a sentinel event for a class of severe traffic injury
that may be reduced by such preventive measures as use of seatbelts and crash
helmets.

sentinel physician, sentinel practice In family medicine, a physician or practice that
undertakes to maintain surveillance for and report certain specific predetermined
events, such as cases of certain communicable diseases or adverse drug reactions.

sequential analysis A statistical method that allows an experiment to be ended as
soon as an answer of the desired precision is obtained. Study and control subjects
are randomly allocated in pairs or blocks. The result of the comparison of each pair
of subjects, one treated and one control, is examined as soon as it becomes available
and is added to all previous results.

serendipity The accidental (and happy) discovery of important new information. A
well-known example is Fleming's discovery of the bactericidal properties of
penicillin mould. In case-control studies aimed at testing a specific hypothesis, e.g., about
the relationship between tobacco and cancer, questions on other aspects of life-style
have serendipitously revealed statistically significant associations, e.g., between
alcohol consumption and certain cancers.

seroepidemiology Epidemiologic study or activity based on the detection on
serological testing of characteristic change in the serum level of specific antibodies. Latent,
subclinical infections and carrier states can thus be detected, in addition to clinically
overt cases.

sex ratio The ratio of one sex to the other. Usually defined as the ratio of males to
females (or of the rates observed in males and females).

"shoe-leather” epidemiology Gathering information for epidemiologic studies by 
direct inquiry among the people, e.g., walking from door to door and asking
questions of every householder (wearing out shoe leather in the process). John Snow
did this when investigating the sources of water supply to households in the cholera
epidemic in London in 1854; the method has been successfully used in many
subsequent epidemic investigations. It is especially useful in
transmitted diseases. 

siblings Children borne by the same mother.

sibship All the brothers and sisters borne by the same mother

sickness See disease. 

side effect An effect, other than the intended one, produo 
nostic, or therapeutic procedure or regimen. 

SIGNAL-TO-NOISE RATIO A jargon term for the relationship of
which is extraneous or irrelevant, or intrudes because n 
other procedures are insufficiently sensitive. 

SIGNIFICANCE See STATISTICAL SIGNIFICANCE. 

Simpson's paradox A form of confounding, in which the pi 
variable changes the direction of an association. Simpsoi 
meta-analysis, because the sum of the data or results fro 
studies may be affected by confounding variables that ha 
sign features from some studies but not others; if this 
analysis will be flawed. Rothman 1 has pointed out that S 
really a paradox but the logical consequence of failing to i 
confounding variables. 

1 Rothman KJ: A pictorial representation of confounding in epiden

28:101-108, 1975. 
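
A worked example may help show how a confounding variable can reverse the direction of an association when data are pooled. The counts below are entirely hypothetical: within each stratum the exposed have the lower risk, yet after pooling the exposed appear to have the higher risk, because most exposed persons fall in the high-risk stratum.

```python
# Hypothetical (cases, persons) by exposure group within two strata of a confounder.
strata = {
    "stratum 1": {"exposed": (10, 100), "unexposed": (150, 1000)},
    "stratum 2": {"exposed": (500, 1000), "unexposed": (60, 100)},
}


def risk(cases, persons):
    """Proportion of persons who became cases."""
    return cases / persons


def pooled_risk(strata, group):
    """Risk for one exposure group after summing counts over all strata."""
    cases = sum(stratum[group][0] for stratum in strata.values())
    persons = sum(stratum[group][1] for stratum in strata.values())
    return risk(cases, persons)


for name, stratum in strata.items():
    print(name,
          "exposed:", risk(*stratum["exposed"]),
          "unexposed:", risk(*stratum["unexposed"]))

print("pooled",
      "exposed:", round(pooled_risk(strata, "exposed"), 3),
      "unexposed:", round(pooled_risk(strata, "unexposed"), 3))
```

Analysing the strata separately, or adjusting for the stratifying variable, recovers the within-stratum direction of the association; the crude pooled comparison does not.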

simulation The use of a model system, e.g., a mathematical m 
to approximate the action of a real system, often used to 
real system. 

situation analysis Study of a situation that may require in 
with a definition of the problem, and an assessment or m 
severity, causes, and impacts upon the community, and is 
interactions between the system and its environment an 
mance. 

skew distribution An older and less recommended term I 
quency distribution. If a unimodal distribution has a lon$ 
lower values of the variate, it is said to have negative skewi 
positive skewness. See also log-normal distribution.

Skew distribution of attack rate of measles in relat
From Lilienfeld and Lilienfeld, 1979.

slow virus Agent causing degenerative (neurological) diseast 
incubation period and a prolonged, slowly progressive cou 
firmed slow virus diseases are Creutzfeldt-Jakob disease
rosis is possibly a slow virus disease. Some cases of AIDS 
ease,

Snow, John (1813-1858) London general practitioner and early anesthetist (he assisted
Queen Victoria's delivery of two of her children with chloroform). His fame rests
upon his observations, brilliant deductions, painstaking personal enquiries, and
analytic studies of cholera outbreaks in the mid-19th century in London and
elsewhere. All are recorded in On the Mode of Communication of Cholera (London:
Churchill, 2nd ed., 1855), which can be regarded as the first definitive working text on
epidemiology and which also contained an explicit statement of the germ theory of
transmission, written 30 years before Koch discovered the cholera vibrio. See also
NATURAL EXPERIMENT.

social class A stratum in society composed of individuals and families of equal
standing. See also socioeconomic classification.

social drift Downward social class mobility as a result of impaired health, often due
to mental disorders.

social medicine The practice of medicine concerned with health and disease as a
function of group living. Social medicine is concerned with the health of people in
relation to their behavior in social groups and as such involves care of the individual
patient as a member of a family and of other significant groups in everyday life. It
is also concerned with the health of these groups as such and with that of the whole
community as a community. See also community medicine; public health.

socioeconomic classification Arrangement of persons into groups according to such
characteristics as prior education, occupation, and income. This usually reveals upon
analysis a strong correlation with health-related characteristics such as average length
of life and risk of dying from certain specific causes.

The oldest such classification that is epidemiologically useful is the Registrar- 
General's (RG's) occupational classification, developed in 1911 by Stephenson, 
Registrar-General of England and Wales. This classified all occupations into five 
groups—the five "social classes." Social class III is often further subdivided into 
nonmanual and manual groups: 

I Professional occupations 
II Intermediate occupations 
IIIn Nonmanual skilled occupations
IIIm Manual skilled occupations
IV Partly skilled occupations 
V Unskilled occupations 

This has proven to be a valuable epidemiologic tool; social class is an accurate, 
consistent predictor of health experience. 

There have been several other attempts to develop a more refined classification; 
however, most refinements require collection of more detailed information. For 
example, Hollingshead's scale requires details about education and income as well 
as occupation, and so is more time-consuming, more likely to be incomplete, and 
requires more costly analysis than the RG's classification. In developing countries,
where up to 90% of the population may be classified under "agriculturalist" or 
"pastoralist" (farming or herding), other types of classifications have been devel¬ 
oped. 

One's prestige in society, and attitudes or values, e.g., setting a high value on 
getting a good education, are generally an integral part of social class or socioeco¬ 
nomic status. Attitudes toward health are often part of the set of values and may 
explain part of the observed difference in health between social classes.

socioeconomic status (SES) Descriptive term for a person's position in society, which
may be expressed on an ordinal scale using such criteria as income, educational
level attained, occupation, value of dwelling place, etc.

software See computer.

soundex code A sequence of letters used for recording names phonetically, especially
in RECORD LINKAGE.
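
The entry does not say which coding rules it has in mind. As a hedged illustration, the sketch below follows the widely used American Soundex convention, which is an assumption here: the initial letter is kept and the remaining consonants are mapped to digits, giving a four-character code (one letter and three digits). The function name `soundex` and the table `CODES` are names chosen for the example.

```python
# Digit assigned to each consonant group under the American Soundex convention.
CODES = {}
for digit, group in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                     "4": "L", "5": "MN", "6": "R"}.items():
    for letter in group:
        CODES[letter] = digit


def soundex(name):
    """Return a four-character phonetic code for a surname."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    code = [letters[0]]                  # the first letter is kept as a letter
    previous = CODES.get(letters[0], "")
    for letter in letters[1:]:
        if letter in "HW":
            continue                     # H and W do not separate equal codes
        digit = CODES.get(letter, "")    # vowels and Y have no digit
        if digit and digit != previous:
            code.append(digit)
        previous = digit                 # a vowel resets the run of equal codes
    return ("".join(code) + "000")[:4]   # pad with zeros or truncate to 4 characters


print(soundex("Robert"), soundex("Rupert"))
```

Names that sound alike but are spelled differently receive the same code, which is what makes the method useful for matching records.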

source of infection The person, animal, object, or substance from which an infec¬ 
tious agent passes to a host. Source of infection should be clearly distinguished 
from source of contamination, such as overflow of a septic tank contaminating a 
water supply, or an infected cook contaminating a salad. (See reservoir.) 1

1 From Control of Communicable Diseases in Man, 14th ed. Washington DC: American Public Health
Association, 1985.

SPEARMAN’S RANK CORRELATION See CORRELATION COEFFICIENT. 

SPECIFICATION 

1. The process of selecting a particular functional form or model for the rela¬ 
tionships to be analyzed in a study. 

2. The process of selecting variables for inclusion in the analysis of an effect or
association. This process leads to the identification of moderator variables
and CONFOUNDING VARIABLES. See also STRATIFICATION.

SPECIFICITY (OF A TEST) See SENSITIVITY AND SPECIFICITY.

spectrum of disease The full range of manifestations of a disease: a vague term, that 
can mean everything from mild or subclinical or precursor states to fulminating, 
florid disease, or alternatively the natural history of a disease from onset to resolu- 
tion. 

spell of sickness An episode of sickness with a well-defined onset and termination. 
As used in the monitoring or surveillance of disease, the spell is often defined by 
the duration of absence from work or school. 
spleen rate A term used in malaria epidemiology to define the frequency of enlarged
spleens detected on survey of a population in which malaria is prevalent. In
association with the Hackett spleen classification it summarizes the severity of
malaria endemicity.

sporadic Occurring irregularly, haphazardly from time to time, and generally
infrequently, e.g., cases of certain infectious diseases.
spot map Map showing the geographic location of people with a specific attribute, e.g., 
cases of a disease or elderly persons living alone. The making of a spot map is a 
common procedure in the investigation of a localized outbreak of disease. Infer¬ 
ences from such a map depend on the assumption that the population at risk of 
developing the disease is fairly evenly distributed over the area, or that at least the 
heterogeneities are known and can be considered in interpreting the map. 
stable population A population that has constant fertility and mortality rates, no
migration, and consequently a fixed age distribution and constant growth rate. See
also stationary population.

standard Something that serves as a basis for comparison; a technical specification or
written report drawn up by experts based on the consolidated results of scientific
study, technology, and experience, aimed at optimum benefits and approved by a
recognized and representative body.

standard deviation A measure of dispersion or variation. It is the most widely used
measure of dispersion of a frequency distribution. It is equal to the positive square
root of the variance. The mean tells where the values for a group are centered.
The standard deviation is a summary of how widely dispersed the values are around
this center.
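
A short numerical sketch of the definition, using made-up values and Python's standard `statistics` module. The entry does not say whether the variance is divided by n or by n - 1; both conventions are shown, and the choice between them is left to the reader.

```python
import math
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up observations

mean = statistics.mean(values)               # where the values are centered
variance = statistics.pvariance(values)      # mean squared deviation (divisor n)
sd = math.sqrt(variance)                     # positive square root of the variance

# The sample convention divides by n - 1 instead of n.
sample_sd = statistics.stdev(values)

print(mean, variance, sd, round(sample_sd, 3))
```

For these values the mean is 5 and the standard deviation (divisor n) is 2, so most observations lie within about two units of the center.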

standard error The standard deviation of an estimate.

standardization A set of techniques used to remove as far as possible the effects of
differences in age or other confounding variables, when comparing two or more
populations. The common method uses weighted averaging of rates specific for
age, sex, or some other potential confounding variable(s) according to some
specified distribution of these variables. There are two main methods, as follows:

Direct method: The specific rates in a study population are averaged, using as weights 
the distribution of a specified standard population. The directly standardized rate 
represents what the crude rate would have been in the study population if that 
population had the same distribution as the standard population with respect to the 
variable(s) for which the adjustment or standardization was carried out.

Indirect method: This is used to compare study populations for which the specific 
rates are either statistically unstable or unknown. The specific rates in the standard 
population are averaged, using as weights the distribution of the study population. 
The ratio of the crude rate for the study population to the weighted average so
obtained is the standardized mortality (or morbidity) ratio, or SMR. The indirectly
standardized rate itself is the product of the SMR and the crude rate for the
standard population.
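
The two methods can be sketched numerically as follows. All counts are hypothetical, the two age groups and the helper names `crude_rate` and `weighted_rate` are assumptions made for the example, and the SMR is left as a plain ratio (not multiplied by 100).

```python
from fractions import Fraction

# Hypothetical data: (deaths, persons) in each age group.
study = {"under 65": (20, 10_000), "65 and over": (60, 2_000)}
standard = {"under 65": (50, 50_000), "65 and over": (1_000, 50_000)}


def crude_rate(population):
    """Total deaths divided by total persons."""
    deaths = sum(d for d, _ in population.values())
    persons = sum(n for _, n in population.values())
    return Fraction(deaths, persons)


def weighted_rate(rates_from, weights_from):
    """Average the specific rates of one population, weighted by the
    age distribution of another population."""
    total = sum(n for _, n in weights_from.values())
    return sum(
        Fraction(*rates_from[group]) * Fraction(weights_from[group][1], total)
        for group in rates_from
    )


# Direct method: study rates, standard population as weights.
direct_rate = weighted_rate(study, standard)

# Indirect method: standard rates, study population as weights.
expected_rate = weighted_rate(standard, study)
smr = crude_rate(study) / expected_rate
indirect_rate = smr * crude_rate(standard)

print(float(direct_rate), float(smr), float(indirect_rate))
```

With these numbers the directly standardized rate is 16 per 1,000, the SMR is 1.6, and the indirectly standardized rate is 16.8 per 1,000; the two methods need not agree because they use different weights.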

standardized mortality (morbidity) ratio (SMR) The ratio of the number of events
observed in the study group or population to the number that would be expected
if the study population had the same specific rates as the standard population,
multiplied by 100.

standardized rate ratio (srr) A rate ratio in which the numerator and denominator 
rates have been standardized to the same (standard) population distribution. 
standard metropolitan statistical area Because of the extensive interactions be¬ 
tween a city and its surrounding areas, a unit encompassing both is needed as a 
base for statistical description. The concept of a standard metropolitan statistical 
area (SMSA) was introduced in the United States to furnish such a unit. To qualify
as an SMSA an area has to meet criteria related to size, social and economic
integration of the city and surrounding county or counties, minimum population
density, and minimum proportion of the labor force engaged in nonagricultural work.
stationary population A stable population that has a zero growth rate with constant 
numbers of births and deaths each year. 

statistics The science and art of collecting, summarizing, and analyzing data that are 
subject to random variation. The term is also applied to the data themselves and to 
summarizations of the data. Statistical terms are defined by Kendall and Buckland. 1 
1 Kendall MG, Buckland WR: A Dictionary of Statistical Terms, 4th ed. London: Longman, 1982.
STATISTICAL ERROR See ERROR. 

STATISTICAL INFERENCE See INFERENCE. 

STATISTICAL MODEL See MATHEMATICAL MODEL. 

statistical significance Statistical methods allow an estimate to be made of the
probability of the observed or greater degree of association between independent and
dependent variables under the null hypothesis. From this estimate, in a sample of
given size, the statistical "significance" of a result can be stated. Usually the level of
statistical significance is stated by the P value.
statistical test A procedure that is intended to decide whether a hypothesis about 
the distribution of one or more populations or variables should be rejected or ac¬ 
cepted. Statistical tests may be parametric or nonparametric. 
stereogram (Syn: isometric chart) A graph or chart that displays more than two
variables in a manner that appears three-dimensional to the eye.
stochastic process A process that incorporates some element of randomness.

strategy In game theory, a mathematical function.

stratification The process of or result of separating a sample into several subsamples 
according to specified criteria such as age groups, socioeconomic status, etc. The 
effect of confounding variables may be controlled by stratifying the analysis of re¬ 
sults. For example, lung cancer is known to be associated with smoking. To examine 
the possible association between urban atmospheric pollution and lung cancer, con¬ 
trolling for smoking, the population may be divided into strata according to smok¬ 
ing status. The association between air pollution and cancer can then be appraised 
separately within each stratum. Stratification is used not only to control for
confounding effects but also as a way of detecting modifying effects. In this example,
stratification makes it possible to examine the effect of smoking on the association 
between atmospheric pollution and lung cancer. 
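
A minimal sketch of the stratified analysis described in the example above, using invented cohort counts. The stratum names, the counts, and the choice of the risk ratio as the measure of association are all assumptions made for illustration.

```python
from fractions import Fraction

# Hypothetical cohort: (lung cancer cases, persons) by smoking stratum and exposure.
cohort = {
    "nonsmokers": {"polluted air": (10, 1_000), "clean air": (5, 1_000)},
    "smokers": {"polluted air": (90, 1_000), "clean air": (30, 1_000)},
}

# Appraise the pollution-cancer association separately within each smoking stratum.
rr_by_stratum = {}
for stratum, groups in cohort.items():
    risk_polluted = Fraction(*groups["polluted air"])
    risk_clean = Fraction(*groups["clean air"])
    rr_by_stratum[stratum] = risk_polluted / risk_clean

# Smoking cannot confound a comparison made within a single stratum; a difference
# between the stratum-specific ratios suggests that smoking modifies the effect.
effect_modified = len(set(rr_by_stratum.values())) > 1

for stratum, rr in rr_by_stratum.items():
    print(stratum, float(rr))
```

Here the risk ratio is 2 among nonsmokers and 3 among smokers, so in this invented data set the association is present in both strata and is stronger among smokers.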

stratified randomization (Syn: blocked randomization) A randomization procedure
in which strata are identified and subjects randomly allocated within each. This 
produces a situation intermediate between paired allocation and simple random 
allocation. 

study design See research design.

subclinical disease See disease, subclinical.

surveillance Ongoing scrutiny, generally using methods distinguished by their
practicability, uniformity, and frequently their rapidity, rather than by complete
accuracy. Its main purpose is to detect changes in trend or distribution in order to
initiate investigative or control measures. See also monitoring.

surveillance of disease The continuing scrutiny of all aspects of occurrence and
spread of a disease that are pertinent to effective control.

Included are the systematic collection and evaluation of (1) morbidity and
mortality reports, (2) special reports of field investigations of epidemics and of
individual cases, (3) isolation and identification of infectious agents by laboratories, (4) data
concerning the availability, use, and untoward effects of vaccines and toxoids, im¬ 
mune globulins, insecticides, and other substances used in control, (5) information 
regarding immunity levels in segments of the population, and (6) other relevant 
epidemiologic data. A report summarizing these data should be prepared and dis¬ 
tributed to all cooperating persons and others with a need to know the results of 
the surveillance activities. The procedure applies to all jurisdictional levels of public 
health from local to international. 1 Serological surveillance identifies patterns of
current and past infection using serological tests. See also seroepidemiology.

1 Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.

survey An investigation in which information is systematically collected but in which 
the experimental method is not used. A population survey may be conducted by 
face-to-face inquiry, by self-completed questionnaires, by telephone, postal service,
or in some other way. Each method has its advantages and disadvantages. For
instance, a face-to-face (interview) survey may be a better way than a self-completed
questionnaire to collect information on attitudes or feelings, but it is more costly.
Existing medical or other records may contain accurate information, but not about 
a representative sample of the population. 

The information that is gathered in a survey is usually complex enough to re¬ 
quire editing (for accuracy, completeness, etc.), coding, keypunching, i.e., entry on 
punch cards, and processing and analysis by machine or computer. The
generalizability of results depends upon the extent to which the surveyed population is
representative.

The term "survey" is sometimes used in a narrow sense to refer specifically to a
FIELD SURVEY.

survey instrument The interview schedule, questionnaire, medical examination re¬ 
cord form, etc., used in a survey. 

survival analysis A class of statistical procedures for estimating the survival func¬ 
tion, and for making inferences about the effects on it of treatments, prognostic 
factors, exposures, and other covariates. 

survival curve A curve that starts at 100% of the study population and shows the
percentage of the population still surviving at successive times for as long as
information is available. May be applied not only to survival as such, but also to the
persistence of freedom from a disease, or complication or some other endpoint.

survival function (Syn: survival distribution) A function of time, usually denoted by
S(t), that starts with a population 100% well at a particular time and provides the
percentage of the population still well at later times. Survival functions may be
applied to any discrete event, for example, disease incidence or relapse, death, or
recovery after onset of disease (in which case the population is initially 100%
diseased, and the "survival" function gives the percentage still diseased).

survival rate (Syn: cumulative survival rate) The proportion of survivors in a group,
e.g., of patients, studied and followed over a period. The proportion of persons in
a specified group alive at the beginning of the time interval (e.g., a five-year period)
who survive to the end of the interval. It is equal to 1 minus the cumulative
mortality rate. May be studied by current or cohort life table methods.

survival ratio The probability of surviving between one age and another; when
computed for age groups, the ratios correspond to those of the person-years-lived
function of a life table.

survivorship study Use of a cohort life table to provide the probability that an event,
such as death, will occur in successive intervals of time after diagnosis and,
conversely, the probability of surviving each interval. The multiplication of these
probabilities of survival for each time interval for those alive at the beginning of that
interval yields a cumulative probability of surviving for the total period of study.
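
The multiplication step can be sketched as follows, using a hypothetical cohort with no losses to follow-up (a simplifying assumption; handling withdrawals is outside this sketch). The interval counts and variable names are invented for the example.

```python
from fractions import Fraction

# Hypothetical cohort life table: (alive at start of interval, deaths during interval).
intervals = [(1_000, 100), (900, 90), (810, 81)]

cumulative_survival = []
surviving = Fraction(1)
for alive_at_start, deaths in intervals:
    p_death = Fraction(deaths, alive_at_start)   # probability of the event in this interval
    p_survive = 1 - p_death                      # conversely, probability of surviving it
    surviving *= p_survive                       # product of the interval probabilities
    cumulative_survival.append(surviving)

print([float(p) for p in cumulative_survival])
```

Each interval here has a survival probability of 0.9, so the cumulative probability of surviving all three intervals is 0.9 x 0.9 x 0.9 = 0.729, which matches the 729 of 1,000 persons still alive at the end.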
Sydenham, Thomas (1624—1689) A great English physician in the tradition of Hippo¬ 
crates and one of the founding fathers of epidemiology (although his ideas about 
the meteorological causes of epidemics were wrong). His writings contain many 
careful and comprehensive accounts of important epidemic diseases, notably
plague, malaria, measles, dysentery, and scarlet fever. His Opera Omnia have been twice
translated into English: the second (and better) two-volume translation by Latham 
was published by the Sydenham Society in 1846-1850. 
symbiosis The biological association of two or more species to their mutual benefit. 
symmetrical relationship An association between variables that does not have
direction.

The following four varieties can be distinguished: 

1. Functional interdependence, where one variable cannot exist without the other: 
e.g., prevalence is a function of incidence and duration. 

2. Common complex, where variables occur together without being interdependent
or necessary to each other: e.g., the occurrence together of air pollution,
poverty, poor housing, and overcrowding. 

3. Alternative indicators of the same entity; e.g., antibodies to a microorganism 
and history of specific infection caused by that microorganism. 

4. The effects of a common cause; e.g., clinical and biochemical changes in
hepatitis.

See also association, symmetrical.

syndrome A symptom complex in which the symptoms and/or signs coexist more
frequently than would be expected by chance on the assumption of independence.

synergism, synergy The definition of synergism in epidemiology is somewhat
controversial. We offer two definitions, the first a common dictionary definition, the
second a more specific definition encountered in bioassay.

1. A situation in which the combined effect of two or more factors is greater 
than the sum of their solitary effects. 

2. Two factors act synergistically if there are persons who will get the disease
when exposed to both factors but not when exposed to either alone.
Antagonism, the opposite of synergism, exists if there are persons who will get the
disease when exposed to one of the factors alone, but not when exposed to
both. Note that under these definitions two factors may act synergistically in
some persons and antagonistically in others.
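
The first definition can be illustrated with simple arithmetic. The sketch below assumes that "effect" is measured as the excess risk over the risk in persons exposed to neither factor; that choice of scale is an assumption, since the entry itself notes that the definition is controversial. The risks are invented.

```python
# Hypothetical risks of disease per 1,000 persons for each combination of factors A and B.
risk = {
    ("no A", "no B"): 1,
    ("A", "no B"): 5,
    ("no A", "B"): 10,
    ("A", "B"): 50,
}

baseline = risk[("no A", "no B")]
effect_a = risk[("A", "no B")] - baseline       # solitary effect of A
effect_b = risk[("no A", "B")] - baseline       # solitary effect of B
effect_joint = risk[("A", "B")] - baseline      # combined effect of A and B

# Definition 1: synergism if the combined effect exceeds the sum of the solitary effects.
excess_over_sum = effect_joint - (effect_a + effect_b)
synergistic = excess_over_sum > 0

print(effect_a, effect_b, effect_joint, synergistic)
```

In this example the solitary effects add up to 13 per 1,000 while the combined effect is 49 per 1,000, so the factors would be called synergistic under the first definition.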

SYSTEMATIC ERROR See BIAS. 

systems analysis This term is used with three similar meanings. 

1. The examination of various elements of a system with a view to ascertaining 
whether the proposed solution to a problem will fit into the system and, in 
turn, effect an overall improvement in the system. 

2. The analysis of an activity in order to determine precisely what is required of 
the system, how this can best be accomplished, and in what ways the computer 
can be useful. 

3. Systems analysis refers to any formal analysis whose purpose is to suggest a 
course of action by systematically examining the objectives, costs, effectiveness 
and risks of alternative policies or strategies and designing additional ones if 
those examined are found wanting. It is an approach to or way of looking at 
complex problems of choice under uncertainty; it is not yet a method.

Takaki, Kanehiro (1849-1915) Japanese nobleman who studied medicine at St
Thomas's Hospital Medical School, London. He became a naval surgeon, and later used
his opportunity as director of naval medical services to conduct large-scale dietary 
experiments on populations of naval personnel, demonstrating that beriberi could 
be prevented by a mixed diet containing protein as well as rice. 

TARGET POPULATION 

1. The collection of individuals, items, measurements, etc., about which we want 
to make inferences. The term is sometimes used to indicate the population 
from which a sample is drawn and sometimes to denote any "reference"
population about which inferences are required.

2. The group of persons for whom an intervention is planned. 
taxonomy A systematic classification into related groups. 

taxonomy of disease The orderly classification of diseases into appropriate categories
on the basis of relationships among them, with the application of names. See also
NOSOGRAPHY, NOSOLOGY.

t-distribution, t-test The t-distribution is the distribution of a quotient of
independent random variables, the numerator of which is a standardized normal variate
and the denominator of which is the positive square root of the quotient of a
chi-square distributed variate and its number of degrees of freedom. The t-test uses a
statistic that, under the null hypothesis, has the t-distribution, to test whether two
means differ significantly, or to test linear regression or correlation coefficients.
The t-distribution and the t-test were developed by WS Gosset, who wrote under
the pseudonym "Student" as his employment precluded individual publication.
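
As a hedged illustration of testing whether two means differ, the sketch below computes the two-sample t statistic in its pooled-variance form, which assumes equal variances in the two groups; that form, the function name `pooled_t`, and the sample values are assumptions made for the example. The statistic is then referred to a t-distribution with the stated degrees of freedom.

```python
import math
import statistics


def pooled_t(sample_a, sample_b):
    """Two-sample t statistic (pooled variance) and its degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)    # sample variance, divisor n - 1
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se_difference = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    difference = statistics.mean(sample_a) - statistics.mean(sample_b)
    return difference / se_difference, df


# Made-up measurements in two groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

t_statistic, degrees_of_freedom = pooled_t(group_a, group_b)
print(round(t_statistic, 3), degrees_of_freedom)
```

The larger the absolute value of the statistic relative to the tabulated t-distribution for the given degrees of freedom, the smaller the P value.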
teratogen A substance that produces abnormalities in the embryo or fetus by
disturbing maternal homeostasis or by acting directly on the fetus in utero.

test of significance See P value; statistical significance.

test hypothesis See null hypothesis.

theoretical epidemiology The development of mathematical/statistical models to
explain different aspects of the occurrence of a variety of diseases. With some
infectious diseases, models have been generated to elucidate the reasons for epidemics
and/or to predict the behavior of the disease in reaction to given control
measures. See also model.

therapeutic trial See clinical trial.

threshold limit value See safety standards.

threshold phenomena Events or changes that occur only after a certain level of a 
characteristic is reached. 
time cluster See clustering.

time-place cluster See clustering.

total fertility rate (TFR) The average number of children that would be born per
woman if all women lived to the end of their childbearing years and bore children
according to a given set of age-specific fertility rates. It is computed by summing
the age-specific fertility rates for all ages and multiplying by the interval into which
the ages are grouped. The TFR is an important fertility measure, providing the
most accurate answer to the question, "How many children does a woman have, on
average?"
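
The computation described above can be sketched with hypothetical age-specific fertility rates for seven five-year age groups. The rates, the grouping, and the variable names are invented for the example.

```python
# Hypothetical age-specific fertility rates: live births per 1,000 women per year.
asfr_per_1000 = {
    "15-19": 50,
    "20-24": 100,
    "25-29": 120,
    "30-34": 80,
    "35-39": 40,
    "40-44": 10,
    "45-49": 0,
}

interval_years = 5   # width of each age group

# Sum the age-specific rates and multiply by the width of the age interval;
# dividing by 1,000 converts the result to children per woman.
tfr = interval_years * sum(asfr_per_1000.values()) / 1000

print(tfr)
```

Each woman is assumed to spend five years in every age group, so the annual rate for a group is counted five times; with these rates the total is 2.0 children per woman.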

tracer disease method Tracer or indicator conditions as defined by Kessner 1 are
easily diagnosed, reasonably frequent illnesses or health states whose outcomes are
believed to be affected by health care and which taken in aggregate should reflect
the gamut of patients and health problems encountered in a medical practice. The
extent to which the recorded care of these conditions concurs with preset standards
of care is used as an index of the quality of care delivered. However, it should first
be shown that the preset standards contribute to a favorable outcome. See also
SENTINEL HEALTH EVENT.

1 Kessner DM, Snow CK, Singer J: Assessment of Medical Care for Children. Washington DC: National
Academy of Sciences, Institute of Medicine, 1974.

transmission of infection Transmission of infectious agents. Any mechanism by which 
an infectious agent is spread through the environment or to another person. These 
mechanisms are defined in Control of Communicable Disease in Man 1 as follows: 

a. Direct transmission 

Direct and essentially immediate transfer of infectious agents (other than 
from an arthropod in which the organism has undergone essential multipli¬ 
cation or development) to a receptive portal of entry through which human 
infection may take place. This may be by direct contact as by touching,
kissing, or sexual intercourse, or by the direct projection (droplet spread) of
droplet spray onto the conjunctiva or onto the mucous membranes of the nose or
mouth during sneezing, coughing, spitting, singing, or talking (usually limited
to a distance of about 1 m or less). It may also be by direct exposure of
susceptible tissue to an agent in soil, compost, or decaying vegetable matter in
which it normally leads a saprophytic existence (e.g., the systemic mycoses),
or by the bite of a rabid animal. Transplacental transmission is another form 
of direct transmission. 

b. Indirect transmission 

Vehicle-borne: Contaminated materials or objects (fomites) such as toys,
handkerchiefs, soiled clothes, bedding, cooking or eating utensils, and surgical
instruments or dressings (indirect contact); water, food, milk, biological
products including blood, serum, plasma, tissues, or organs; or any substance
serving as an intermediate means by which an infectious agent is transported and
introduced into a susceptible host through a suitable portal of entry. The agent 
may or may not have multiplied or developed in or on the vehicle before 
being transmitted. 

Vector-borne: (1) Mechanical: Includes simple mechanical carriage by a
crawling or flying insect through soiling of its feet or proboscis, or by passage of
organisms through its gastrointestinal tract. This does not require
multiplication or development of the organism. (2) Biological: Propagation
(multiplication), cyclic development, or a combination of these (cyclopropagative) is
required before the arthropod can transmit the infective form of the agent to
man. An incubation period (extrinsic) is required following infection before
the arthropod becomes infective. The infectious agent may be passed vertically
to succeeding generations (transovarian transmission); transstadial
transmission is its passage from one stage of the life cycle to another, as nymph to
adult. Transmission may be by saliva during biting or by regurgitation or dep¬ 
osition on the skin of feces or other material capable of penetrating subse¬ 
quently through the bite wound or through an area of trauma from scratching 
or rubbing. This is transmission by an infected nonvertebrate host and must 
be differentiated for epidemiologic purposes from simple mechanical carriage 
by a vector in the role of a vehicle. An arthropod in either role is termed a 
"vector." 

Airborne: The dissemination of microbial aerosols to a suitable portal of
entry, usually the respiratory tract. Microbial aerosols are suspensions in the air
of particles consisting partially or wholly of microorganisms. Particles in the
1-5 μm range are easily drawn into the alveoli of the lungs and may be retained
there; many are exhaled from the alveoli without deposition. They may
remain suspended in the air for long periods of time, some retaining and others
losing infectivity or virulence. Not considered as airborne are droplets and
other large particles that promptly settle out (see Direct transmission, above).

The following are airborne and their mode of transmission is indirect:

Droplet nuclei: Usually the small residues that result from evaporation of
fluid from droplets emitted by an infected host (see above). Droplet nuclei also 
may be created purposely by a variety of atomizing devices, or accidentally as 
in microbiology laboratories or in abattoirs, rendering plants, or autopsy rooms. 
They usually remain suspended in the air for long periods of time. 

Dust: The small particles of widely varying size that may arise from soil (as,
for example, fungus spores separated from dry soil by wind or mechanical
agitation), clothes, bedding, or contaminated floors. 1 See also ACQUAINTANCE
NETWORK; AIR-BORNE INFECTION; CARRIER; COMMON VEHICLE SPREAD; CONTACT;
CONTAMINATION; DROPLET NUCLEI.

1 Benenson AS (Ed.): Control of Communicable Diseases in Man, 14th ed. Washington DC: American
Public Health Association, 1985.

TRANSOVARIAL TRANSMISSION See VECTOR-BORNE INFECTION.

TRANSPORT HOST See PARATENIC HOST. 

trend A long-term movement in an ordered series, e.g., a time series. An essential 
feature is that the movement, while possibly irregular in the short term, shows
movement consistently in the same direction over a long term. The term is also 
used loosely to refer to an association which is consistent in several samples or strata 
but is not statistically significant. 

trend line The line that best fits the distribution of a set of values plotted on two
axes. 

trial See clinical trial. 

trohoc study A retrospective case-control study. The term, proposed by AR
Feinstein, 1 is the inversion of "cohort"; its use is deprecated by the great majority of
epidemiologists.

1 Clin Pharmacol Ther 30:564-577, 1981.

TYPE I ERROR See ERROR. 

type II ERROR See ERROR. 

twin study Method of detecting genetic etiology in human disease. The basic premise
of twin studies is that monozygotic twins, being formed by the division of a single
fertilized ovum, carry identical genes, while dizygotic twins, being formed by the
fertilization of two ova by two different spermatozoa, are genetically no more
similar than two siblings born after separate pregnancies.

TWO-TAIL TEST A statistical significance test based on the assumption that the data are 
distributed in both directions from some central value(s).

unbiassed estimator An estimator that for all sample sizes has an expected value equal
to the parameter being estimated. If an estimator tends to be unbiassed as sample 
size increases, it is referred to as asymptotically unbiassed. 
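
The distinction can be checked exactly on a tiny example. The sketch below enumerates every equally likely sample of size 2 drawn with replacement from a four-value population (both the population and the sampling scheme are assumptions made for the example) and compares the expected value of two variance estimators with the true population variance.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 4]   # a made-up population
n = 2                       # sample size


def mean(xs):
    return Fraction(sum(xs), len(xs))


def variance(xs, divisor):
    """Sum of squared deviations from the mean, divided by the given divisor."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / divisor


true_variance = variance(population, len(population))

# All 16 equally likely ordered samples of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of each estimator = its average over all possible samples.
expected_with_n_minus_1 = mean([variance(s, n - 1) for s in samples])
expected_with_n = mean([variance(s, n) for s in samples])

print(true_variance, expected_with_n_minus_1, expected_with_n)
```

The estimator with divisor n - 1 has expected value 5/4, equal to the population variance, so it is unbiassed in this setting. The estimator with divisor n has expected value 5/8, which is (n - 1)/n times the true value; because that factor approaches 1 as n grows, it is asymptotically unbiassed.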

UNDERLYING CAUSE OF DEATH See DEATH CERTIFICATE. 

underreporting Failure to identify and/or count all cases, leading to reduction of the
numerator in a rate. See also error.

UTtUTY In economics, this means satisfaction derived from obtaining some quantity of 
a specified article of commerce. When used in decision theory or cunical decision 
analysis, the meaning is essentially the same, and can be expressed as the useful¬ 
ness or desirability of an outcome resulting from a decision. 
vaccination Strictly speaking, vaccination refers to inoculation (from Latin rn oculus , 
into a bud) with vaccinia virus against smallpox. Nowadays the word is broadly used 
synonymously with procedures for immunization against all infectious disease. 
vaccine Immunobiological substance used for active immunization by introducing into
the body a live modified, attenuated, or killed inactivated infectious organism or its
toxin. The vaccine is capable of stimulating immune response by the host, who is
thus rendered resistant to infection. The word "vaccine" was originally applied to
the serum from a cow infected with vaccinia virus (cowpox; from Latin vacca, cow);
it is now used of all immunizing agents.
validation The process of establishing that a method is sound. 
validity This term, derived from the Latin validus , strong, has several meanings, usu¬ 
ally accompanied by a qualifying word or phrase. 
validity, measurement An expression of the degree to which a measurement mea¬ 
sures what it purports to measure. 

Several varieties are distinguished, including construct validity, content validity, 
and criterion validity (concurrent and predictive validity). 

Construct validity: The extent to which the measurement corresponds to theoretical
concepts (constructs) concerning the phenomenon under study. For example, if
on theoretical grounds, the phenomenon should change with age, a measurement 
with construct validity would reflect such a change. 

Content validity: The extent to which the measurement incorporates the domain 
of the phenomenon under study. For example, a measurement of functional health 
status should embrace activities of daily living, occupational, family, and social func¬ 
tioning, etc. 

Criterion validity: The extent to which the measurement correlates with an exter¬ 
nal criterion of the phenomenon under study. Two aspects of criterion validity can 
be distinguished: 

1. Concurrent validity: The measurement and the criterion refer to the same point 
in time. An example would be a visual inspection of a wound for evidence of 
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infection validated against bacteriological examination of a specimen taken at 
the same time. 

2. Predictive validity: The measurement’s validity is expressed in terms of its abil¬ 
ity to predict the criterion. An example would be an academic aptitude test 
that was validated against subsequent academic performance. 
validity, study The degree to which the inferences drawn from a study, especially
generalizations extending beyond the study sample, are warranted when account is
taken of the study methods, the representativeness of the study sample, and the
nature of the population from which it is drawn. Two varieties of study validity are 
distinguished: 

1. Internal validity: The index and comparison groups are selected and compared 
in such a manner that the observed differences between them on the depen¬ 
dent variables under study may, apart from sampling error, be attributed only 
to the hypothesized effect under investigation. 

2. External validity (generalizability): A study is externally valid or generalizable if
it can produce unbiased inferences regarding a target population (beyond the 
subjects in the study). This aspect of validity is only meaningful with regard 
to a specified external target population. For example, the results of a study 
conducted using only white male subjects might or might not be generalizable 
to all human males (the target population consisting of all human males). It is 
not generalizable to females (the target population consisting of all people). 
The evaluation of generalizability usually involves much more subject-matter 
judgment than internal validity. 

These epidemiologic definitions of the terms "internal validity" and "external
validity" do not correspond exactly to some definitions found in the sociological
literature.

variable Any quantity that varies. Any attribute, phenomenon, or event that can have 
different values. 

variable, antecedent A variable that causally precedes the association or outcome 
under study. See also explanatory variable; independent variable.
variable, confounding See confounding. 

variable, control Independent variable other than the "hypothetical causal variable" 
that has a potential effect on the dependent variable and is subject to control by 
analysis. 

variable, dependent See dependent variable. 

variable, distorter A confounding variable that diminishes, masks, or reverses the
association under study. 

variable, experiential See independent variable.
variable, independent See independent variable.
variable, intervening See intervening variable.
variable, manifestational See dependent variable.
variable, moderator See effect modifier.

VARIABLE, PASSENGER See PASSENGER VARIABLE. 

variable, uncontrolled A (potentially) confounding variable that has not been brought 
under control by design or analysis. See also confounding.
variance A measure of the variation shown by a set of observations, defined by the 
sum of the squares of deviations from the mean, divided by the number of degrees 
of freedom in the set of observations.
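
The definition of variance above can be sketched in a few lines of Python. This is only an illustration: the function name `sample_variance` and the example data are assumptions, and the degrees of freedom are taken to be n - 1, the usual choice when the mean is estimated from the same observations.

```python
from statistics import mean


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by the degrees of freedom.

    Assumption: degrees of freedom = n - 1, because the mean is estimated
    from the same set of observations.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    m = mean(values)
    sum_of_squares = sum((x - m) ** 2 for x in values)
    return sum_of_squares / (n - 1)


# Illustrative observations, not taken from any study.
observations = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(observations))
```

Dividing by n instead of n - 1 would give a slightly smaller value; for large samples the two results are nearly the same.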

variate (Syn: random variable) A variable that may assume any of a set of values, each
with a preassigned probability (known as its distribution).

vector

1. In infectious disease epidemiology, an insect or any living carrier that transports
an infectious agent from an infected individual or its wastes to a susceptible
individual or its food or immediate surroundings. The organism may or
may not pass through a developmental cycle within the vector. 

2. In statistics, an ordered set of numbers representing the values of a set of 
variables, 

vector-borne infection Several classes of vector-borne infections are recognised, each
with epidemiologic features that are determined by the interaction between the 
infectious agent and the human host, on the one hand, and the vector on the other. 
Therefore, environmental factors such as climatic and seasonal variations influence 
the epidemiologic pattern by virtue of their effects on the vector and its habits. 

The terms used to describe specific features of vector-borne infections are: 

Biological transmission: Transmission of the infectious agent to susceptible host by
bite of blood-feeding (arthropod) vector as in malaria, or by other inoculation, as
in Schistosoma infection.

Extrinsic incubation period: Time necessary after acquisition of infection by the (ar¬ 
thropod) vector for the infectious agent to multiply or develop sufficiently so that 
it can be transmitted by the vector to a vertebrate host.

Hibernation: A possible mechanism by which the infected vector survives adverse
cold weather by becoming dormant. 

Inapparent infection: Response to infection without developing overt signs of illness.
If this is accompanied by viremia or bacteremia in a high proportion of infected
animals or persons, the receptor species is well suited as an epidemiologically
important host in the transmission cycle. 

Mechanical transmission: Transport of the infectious agent between hosts by arthropod
vectors with contaminated mouthparts, antennae, or limbs. There is no
multiplication of the infectious agent in the vector. 

Overwintering: Persistence of the infectious microorganism in the vector for ex¬ 
tended periods, such as the cooler winter months, during which the vector has no 
opportunity to be reinfected or to infect a vertebrate host. Overwintering is an 
important concept in the epidemiology of vector-borne diseases since the annual 
recrudescence of viral activity after periods (winter, dry season) adverse to contin¬ 
ual transmission depends upon a mechanism for local survival of an infectious mi¬ 
croorganism or its reintroduction from outside the endemic area. To some extent, 
the risk of a summertime epidemic may be determined by the relative success of 
microorganism survival in the local winter reservoir. Since overwinter survival may 
in turn depend upon the level of activity of the microorganism during the preced¬ 
ing summer-fall, outbreaks sometimes occur for two or more successive years. 

Transovarial infection (transmission): Transmission of the infectious microorganism 
from the affected female arthropod to her progeny. 

vector space An area (or volume) defined by the specified dimensions of two (or 
three) vectors. 

vehicle of infection transmission The mode of transmission of an infectious agent
from its reservoir to a susceptible host. This can be person-to-person, food, vector- 
borne, etc. 

Venn diagram A pictorial presentation of the extent to which two or more quantities 
or concepts are mutually inclusive and mutually exclusive. 

Virchow, Rudolf (1821-1902) Born in Pomerania, Virchow graduated in medicine
from Berlin in 1843 and rapidly established his reputation as the leading medical
scientist of his time. Modern pathology owes much to his rigorous use of hypothesis-
testing methods, illustrated in his first paper in the journal he founded, Archiv für
pathologische Anatomie, now universally known as Virchow’s Archives. Virchow was
also a practicing epidemiologist, who investigated a serious epidemic of typhus in 
Silesia in 1848; his recommendations for hygienic and social reform got him into 
trouble with the government, but his scientific brilliance made it impossible for the 
authorities not to recognize and reward him with promotions and honors. He en¬ 
tered Parliament in 1862, and during the Franco-Prussian War he organized an 
ambulance service. He made many contributions of fundamental importance to the 
science of pathology, but deserves to be remembered as a great humanitarian as
well.

virgin population A population that has never been exposed to a particular infectious
agent.

virulence The degree of pathogenicity; the disease-evoking power of a microorganism
in a given host. Numerically expressed as the ratio of the number of cases of
overt infection to the total number infected, as determined by immunoassay. When
death is the only criterion of severity, this is the case-fatality rate. 
vital records (Literally, "To do with living") Certificates of birth, death, marriage,
and divorce required for legal and demographic purposes. 
vital statistics Systematically tabulated information concerning births, marriages, divorces,
separations, and deaths based on registrations of these vital events.

washout phase That stage in a study, especially a therapeutic trial, when treatment is
withdrawn so that its effects disappear and the subject's characteristics return to
their baseline state.

worm count A method of surveillance of helminth infection of the gut that depends 
upon counts of worms, or their cysts or ova, in quantitatively titrated samples of 
feces. Other terms used to describe this form of surveillance are "egg count," "cyst
count," and "parasite count."

Wu, Lien-Teh (1879-1960) Chinese epidemiologist, responsible for controlling the plague
pandemic in Manchuria in 1910-11. Later he worked on control of sexually transmitted
diseases and other socioeconomically determined health conditions, developed
a national quarantine service and was one of the founders of the Chinese
Medical Association, thus helping to lay the foundations for health improvements
in modern China.

xenobiotic

1. (Syn: commensal, symbiosis) Pertaining to association of two animal species,
usually insects, in the absence of a dependency relationship, as opposed to 
parasitism. 

2. A foreign compound that is metabolized in the body. Many pesticides and 
their derivatives, some food additives and a number of other complex organic 
compounds such as dioxins and PCBs, are xenobiotics.

XENODIAGNOSIS Detection of a (human) pathogenic organism by allowing a noninfected
vector (e.g., mosquito) to consume infected material, and then examining this vector
for evidence of the pathogen.

Yates' correction An adjustment proposed by Yates (1934) in the chi-square calculation
for a 2x2 table, which brings the distribution based on discontinuous frequencies
closer to the continuous chi-square distribution from which the published
tables for testing chi-squares are derived.
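
As a rough illustration of the adjustment, the sketch below computes the chi-square statistic for a 2x2 table with and without the continuity correction, using the common shortcut formula for 2x2 tables. The function name `chi_square_2x2`, the cell labels a, b, c, d, and the example counts are assumptions made for this example only.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    With yates=True, |ad - bc| is reduced by n/2 before squaring
    (and never allowed to go below zero), which is the usual form
    of the continuity correction for a 2x2 table.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    return n * diff ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts: exposed/unexposed by diseased/not diseased.
uncorrected = chi_square_2x2(10, 20, 30, 40, yates=False)
corrected = chi_square_2x2(10, 20, 30, 40, yates=True)
print(uncorrected, corrected)
```

The corrected value is never larger than the uncorrected one, so the correction makes the test somewhat more conservative, most noticeably when the counts are small.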

YEARS OF POTENTIAL LIFE LOST (YPLL) See POTENTIAL YEARS OF LIFE LOST.

yield The number or proportion of cases of a condition accurately identified by a 
screening test.

Youden's index When assessing screening tests, in the uncommon case where the risk 
of a false negative and that of a false positive result are assumed to be equivalent 
(i.e., specificity and sensitivity assumed to be equally important), it may be possible 
to compare screening tests through the Youden index based on the sum of specific¬ 
ity and sensitivity: 

Youden index J = specificity + sensitivity - 1

with J ranging from zero (specificity = 0.50 and sensitivity = 0.50) to 1 (sensitivity
= 1.00, specificity = 1.00).
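
The index is simple enough to compute directly. The short sketch below follows the formula in this entry; the function name `youden_index` and the example values for the two hypothetical screening tests are illustrative assumptions.

```python
def youden_index(sensitivity, specificity):
    """Youden index: specificity + sensitivity - 1.

    Both arguments are proportions between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return specificity + sensitivity - 1


# Two hypothetical screening tests compared on the same index.
test_a = youden_index(sensitivity=0.90, specificity=0.80)
test_b = youden_index(sensitivity=0.70, specificity=0.95)
print(test_a, test_b)
```

Because the index weights sensitivity and specificity equally, it is only a fair basis for comparison when a false negative and a false positive are judged equally serious, as the entry notes.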
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zero-time shift This concerns the selection of a starting point for the measurement 
of survival following the detection of disease. It is a jargon term, denoting the 
movement "backward" (toward the starting point of a disease) of time between onset
and detection, that may accompany use of a screening procedure.
zoonosis An infection or infectious disease transmissible under natural conditions from 
vertebrate animals to man. Examples include rabies and plague. May be enzootic 
or epizootic. 
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Foreword 


REPORTERS play an essential role in communicating
science to the public. In common with scientists, they desire 
accuracy. Although health and medicine provide many exciting 
stories, the biostatistics that scientists must use in their studies 
presents special problems for reporters. It gives uncommon and 
misleading meanings to common words like “significant,” “consistent,”
and “power.” Mathematical statistics often produces results
that are disturbingly counterintuitive, at least at first, to
laymen and scientists alike. In vital statistics and epidemiology, 
definitions often seem arbitrary, and slight changes make con¬ 
siderable differences in the findings. 

Science writers often take short courses in special topics 
such as biostatistics. I have taught in some of these courses and 
have been impressed by the seriousness of the participants. Nev¬ 
ertheless, they need some of this material in an accessible and 
permanent form. 

Victor Cohn of the Washington Post has prepared this man¬ 
ual to help all reporters cut through these statistical tangles. He 
wants to give them a guide to the ways that statistics can clarify 
facts or mystify the reader. 

Cohn’s book grew out of the Media Project of our Health 
Science Policy Working Group of the Division of Health Policy 
Research and Education at Harvard University. I am pleased 
that faculty members of the Harvard School of Public Health 
have been able to help him produce this book as a visiting fellow
in 1978 and 1984 and as a contributor to the Health Science
Policy Working Group. 

Through the Media Project, with the help of Jay Winsten, 
we have also examined sources of pressures on the science 
writer. 1 In the future we want to use what we have learned 
through many discussions with science writers to advise scien¬ 
tists on their role in the media. 

By such efforts, including this book, and by many similar 
efforts in this and other fields, scientists and writers may gradu¬ 
ally upgrade the whole communication system, scientific and 
journalistic. Thus we may clear the communication channel
between science and the public. 

Frederick Mosteller

FACTS AND FIGURES: WE CAN DO BETTER


Science is observation, experimentation, measurement, and all 
these involve numbers, whether we reporters pay attention to 
them or not. 

Statistics are used or misused even by people who tell us, “I
don’t believe in statistics,” then claim that all of us or most people
or many do such and such. The question for reporters is, how 
should we not merely repeat such numbers, stated or implied, 
but also interpret them to deliver the best possible picture of 
reality? 

We can be better reporters if we understand how the best 
statisticians—the best figurers—figure. And if we learn a few 
questions to help us separate the wheat from the chaff. 

I do not say that telling the truth—describing reality—will 
then become easy, for we are constantly bombarded with sweep¬ 
ing claims in convincing wrappings, and the disputed subjects 
are endless. Medical and surgical treatments, radiation, pesti¬ 
cides, nuclear power, the probability of environmental disasters, 
the side effects of medicines—almost nothing seems settled. 

Like it or not, we must wade in. Whether we will it or not, 
we have in effect become part of the regulatory apparatus. Dr. 
Peter Montague of Princeton University tells us, “The environmental
and toxic situation is so complex, we can’t possibly have
enough officials to monitor it. Reporters help officials decide
where to focus their activity.”

“Journalists opened up” the Love Canal toxic waste issue by
“independent investigation,” according to Cornell University’s
Dr. Dorothy Nelkin. “The extensive press coverage contributed
to investigations that eventually forced the re-staffing of the Environmental
Protection Agency and the creation of a national
toxic waste disposal program.” 1

That very coverage, however, may also have stampeded 
public officials into hasty, ill-conceived studies that left unanswered
the crucial question: Did the Love Canal wastes actually
cause birth defects and other physical problems? 2 The
very way we report a medical or environmental controversy can 
affect the outcome. If we ignore a bad situation, the public may
suffer. If we write “danger,” the public may quake. If we write
“no danger,” the public may be falsely reassured. If we paint an
experimental medical treatment too brightly, the public is given 
false hope. 

It is not just what we write, it is what we emphasize. A 
National Cancer Institute survey indicated that many persons 
refuse to consider healthy changes in life-style because they 
think “carcinogens are everywhere in the environment.” Such 
persons probably have read or heard again and again that most
cancers are environmentally related, although, in the opinion of
most informed scientists, most fatal “environmental” cancers are
related mainly to individual behavior, outstandingly smoking,
and very possibly diet. By various estimates, perhaps 5 to 15
percent of all cancers are related to exposures to man-made
carcinogens—chemicals we have inserted into the workplace,
foods, air, and water. 3

When it comes to such emotionally charged and complex
issues, or when it simply comes to running for page one or
making the six o’clock news, the best among us sometimes overstate
or understate. Philip Meyer, veteran reporter and author
of Precision Journalism, writes, “Journalists who misinterpret 
statistical data usually tend to err in the direction of overin¬ 
terpretation. . . . The reason for this professional bias is self- 
evident; you usually can’t write a snappy lead upholding [the 
negative]. A story purporting to show that apple pie makes you 
sterile is more interesting than one that says there is no evidence 
that apple pie changes your life.” 4

We also work fast, sometimes too fast, with severe limits on 
the space or time we may fill. We find it hard to tell editors or 
news directors, “I haven’t had enough time. I don’t have the 
story yet.” Even a long-term project or special may be hurriedly 
done. In a newsroom “long-term” may mean a few weeks. A
major southern newspaper had to print a long, front-page re¬ 
traction after a series of front-page stories alleged that people 
who worked at or lived near a plutonium plant suffered in excess 
numbers from a blood disease. “Our reporters obviously had 
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confused statistics and scientific data,” the editor admitted. “We 
did not ask enough questions.” 5

We tend to oversimplify. We may report, “A study showed 
that black is white” or “So-and-so announced that . . .” when a
study merely suggested that there was some evidence that such
might be the case. We may slight or omit the fact that a scientist
calls a result “preliminary.” As scientific unsophisticates, we may
confuse a study that merely suggests a hypothesis that should be 
investigated—very frequently the case—with a study that 
presents strong and conclusive evidence. 

We often omit essential perspective, context, or back¬ 
ground. Dr. Thomas Vogt of the Kaiser Permanente Center for 
Health Research tells of seeing the headline “Heart Attacks 
From Lack of ‘C’” and then, two months later, “People Who
Take Vitamin C Increase Their Chances of a Heart Attack.” 6
Both stories were based on limited, and far from conclusive,
animal studies. 

Scientists who do poor studies or overstate their results 
deserve part of the blame. But bad science is no excuse for bad 
journalism. We tend to rely most on “authorities” who are either
most quotable or quickly available or both, and they often tend
to be those who get most carried away with their sketchy and
unconfirmed but “exciting” data—or have big axes to grind,
however lofty their motives. The cautious, unbiased scientist
who says, “Our results are inconclusive” or “We don’t have
enough data yet to make any strong statement” or “I don’t know”
tends to be omitted or buried someplace down in the story. 

We are influenced too by intense and growing competition 
to tell the story first and tell it most dramatically. I was once
asked by a Harvard researcher, “Does competition affect the way 
you present a story?" I thought and had to answer, “We have to 
almost overstate. We have to come as close as we can within the 
boundaries of truth to a dramatic, compelling statement. A 
weak statement will go no place." Another reporter said, “The 
fact is, you are going for the strong [lead and story]. And, while 
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The Certainty 
of Uncertainty 


Too much of the science reporting in the press [blurs] what we’re sure of and
what we’re not very sure of and what is inconclusive. The notion of tentative¬ 
ness tends to drop out of much reporting. 

—Dr. Harvey Brooks 


The only trouble with a sure thing is the uncertainty. 


—Author unknown 


THE first thing to understand about science is that it is
almost always uncertain. A scientist, seeking to explain or un¬ 
derstand something—be it the behavior of an atom or the effect 
of the toxic chemicals at a Love Canal—usually proposes a 
hypothesis, then seeks to test it by experiment or observation. If 
the evidence is strongly supportive, the hypothesis may then 
become a theory or at some point even a law, like the law of 
gravity. 

A theory may be so solid that it is generally accepted. 
Example: the theory that cigarette smoking causes lung cancer, 
for which almost any reasonable person would say the case has 
been proved, for all practical purposes. The phrase “for all prac¬ 
tical purposes” is important, for scientists, being practical peo¬ 
ple, must often speak at two levels: the strictly scientific level 
and the level of ordinary reason that we require for daily guid¬ 
ance. 

Example: In June 1985, 16 forensic experts examined the 
bones that were supposedly those of the “Angel of Death," Dr. 
Josef Mengele. Dr. Lowell Levine, delegated by the Department
of Justice, then said, “The skeleton is that of Josef
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THE CERTAINTi 


listener phoned in to exclaim, “‘They say’ is a damned liar!”

“They,” of course, may be different theys who arrive at different
conclusions about inconclusive evidence in a thousand
areas: the role of fats and cholesterol in the diet, the effects of 
low-level radioactivity, the cause of the extinction of dinosaurs. 

Why so much uncertainty? Science is always a continuing 
story. Nature is complex, and almost all methods of observation 
and experiment are imperfect. “There are flaws in all studies,”
says Harvard’s Dr. Marvin Zelen. 3 There may be weaknesses,
often unavoidable ones, in the way a study is designed or con¬ 
ducted. Observers are subject to human bias and error. Subjects 
fluctuate. Measurements fluctuate. 

Many studies are thus inconclusive, and virtually no single 
study proves anything. “Fundamentally,” writes Dr. Thomas
Vogt, “all scientific investigations require confirmation, and until
it is forthcoming all results, no matter how sound they may
seem, are preliminary.” 4

Medicine, in particular, is full of disagreement and con¬ 
troversy. “No clinical trial is ever perfect," Harvard’s Dr. John 
Bailar observes. Unlike new drugs, medical treatments and tests 
and surgical operations need not even be subjected to experi¬ 
mental studies before being applied. “Most treatments escape 
and will continue to escape rigorous evaluation,” Bailar says. 5

The reasons are many: lack of funds to mount enough 
trials; lack of enough patients at any one center to mount a 
meaningful trial; the expense and difficulty of doing multicenter 
trials; the swift evolution and obsolescence of medical tech¬ 
niques; the fact that, with the best of intentions, medical data— 
histories, physical examinations, interpretations of tests, descrip¬ 
tions of symptoms and diseases—are notoriously inexact and 
vary from physician to physician; and the serious ethical obsta¬ 
cles to trying a new procedure when an old one is doing some 
good, or to experimenting on children, pregnant women, or the 
mentally ill. 

While all studies have flaws, some have more flaws than 
others. Study after study has found that many articles in the 
most prestigious medical journals are replete with shaky statis¬ 
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tics and lack of any explanation of such crucial matters as pa¬ 
tients’ complications and the number of patients lost to follow¬ 
up. Papers presented at medical meetings, many of them widely 
reported by the media, are even less reliable. Many papers are
mere progress reports on incomplete studies. Some state tenta¬ 
tive results that later collapse. Some are given to draw comment 
or criticism or get others interested in a provocative but still 
uncertain finding. 6

The upshot, according to Dr. Gary Friedman of the Kaiser 
organization’s Permanente Medical Group: “Much of health 
care is based on tenuous evidence and incomplete knowledge. . . .
Seemingly authoritative statements and accepted medical
doctrines, perpetuated through textbook and lectures, often turn 
out to be supported by the most meager of evidence, if any can 
be found." 7 

In general, possible risks tend to be underestimated and 
possible benefits overestimated. For decades surgeons swore 
that only a radical mastectomy was the treatment for breast 
cancer. Only recently were clinical trials mounted to show that 
less drastic treatments seem equally effective. Prefrontal lobot- 
omy, overstrict bed rest, drugs by the carload—medical history 
is rich in treatments that were given for years without question 
or statistically rigorous study, only to be proved wrong and 
discarded. 

Occasionally, unscrupulous investigators falsify their re¬ 
sults. More often, they may wittingly or unwittingly play down 
data that contradict their theories, or they may search out statis¬ 
tical methods that give them the results they want. Before 
ascribing fraud, says Harvard’s Dr. Frederick Mosteller, “keep 
in mind the old saying that most institutions have enough incompetence
to explain almost any results.” 8

So some uncertainty almost always prevails. But uncer¬ 
tainty need not stand in the way of good sense. To live—to 
survive on this globe, to maintain our health, to set public 
policy, to govern ourselves—we almost always must act on the 
basis of incomplete or uncertain information. There is a way we 
can do so. 






Somehow the wondrous promise of the earth is that there are things beautiful in 
it, things wondrous and alluring, and by virtue of your trade, you want to
understand them. 

— Mitchell Feigenbaum
Cornell University physicist and mathematician

The great tragedy of Science—the slaying of a beautiful hypothesis by an ugly 
fact. 

—Thomas Henry Huxley 


To reporters, the world is full of true believers, peddling 
their “truths.’’ The sincerely misguided and the outright fakers 
are often highly convincing, also newsy. How can we tell the 
facts, or the probable facts, from the chaff? 

We can borrow from science. We can try to judge all possi¬ 
ble claims of fact by the same methods and rules of evidence that 
scientists use to derive some reasonable guidance in scores of 
unsettled issues. 

As a start, we can ask these questions: 

How do you know? 

Have the claims been subjected to any studies or experiments?

Were the studies acceptable ones, by general agreement? For exam¬ 
ple: Were they without any substantial bias? 

Have results been fairly consistent from study to study? 

Have the findings resulted in a consensus among others in the same 
field? Do at least the majority of informed persons agree? Or should we 
withhold judgment until there is more evidence? 

Always: Are the conclusions backed by believable statistical evidence? 
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And what is the degree of certainty or uncertainty? How sure can you 
be? 


Obviously, much of statistics involves attitude or policy 
rather than numbers. And much, at least much of the statistics 
that reporters can most readily apply, is good sense. 

There are many definitions of statistics as a tool. A few 
useful ones: The science and art of gathering, analyzing, and 
interpreting data; a means of deciding whether an effect is real; 
a way of extracting information from a mass of raw data; a set 
of mathematical processes derived from probability theory. 

Statistics can be manipulated by charlatans, self-deluders,
and inexpert statisticians. Deciding on the truth of a matter can 
be difficult for the best statisticians, and sometimes no decision is 
possible. Uncertainty will ever rule in some situations and lurk 
in almost all. 

There are rare situations in which no statistics are needed.
“Edison had it easy,” says Dr. Robert Hooke, a statistician and
author. “It doesn’t take statistics to see that a light has come on.” 1
It did not take statistics to tell 19th-century physicians that Morton’s
ether anesthesia permitted painless surgery or to tell 20th-
century physicians that the first antibiotics cured infections that
until then had been highly fatal.

Overwhelmingly, however, the use of statistics, based on 
probability, is called the soundest method of decision making, 
and the use of large numbers of cases, statistically analyzed, is 
called the only means for determining the unknown cause of
many events. Birth control pills were tested on several hundred
women, yet the pills had to be used for several years by millions
before it became unequivocally clear that some women would
develop heart attacks or strokes. The pills had to be used for
some years more before it became clear that the greatest risk
was to women who smoked and women over 35. 

The best statisticians, let alone practitioners on the firing 
line (for example, physicians), often have trouble deciding when 
a study is adequate or meaningful. Most of us cannot become 
statisticians, but we can at least learn that there are studies and 
studies, and the unadorned claim “We made a study” or “We did 
an experiment” may not mean much. We can learn to ask more 
pointed questions if we understand some basic concepts and 
other facts about scientific studies. 

These are some bedrock statistical concepts: 

• Probability 

• “Power” and numbers 

• Bias and confounders 

• Variability 

Probability 

Scientists cope with uncertainty by measuring probabilities. 
Since all experimental results and all events can be influenced 
by chance and almost nothing is 100 percent certain in science 
and medicine and life, probabilities sensibly describe what has 
happened and should happen in the future under similar condi¬ 
tions. Aristotle said, “The probable is what usually happens," but 
he might have added that the improbable happens more often 
than most of us realize. 

The accepted numerical expression of probability in evaluating 
scientific and medical studies is the P (or probability) value. 
The P value is one of the most important figures a reporter 
should look for. It is determined by a statistical formula that 
takes into account the numbers of subjects or events being com¬ 
pared in order to answer the question, could a difference or 
result this great or greater have occurred by chance alone? By more 
precise definition, the P value expresses the probability that an 
observed relationship or effect or result could have seemed to 
occur by chance if there had actually been no real effect. A low P value 
means a low probability that this happened, that a medical 
treatment, for example, might have been declared beneficial 
when in truth it was not. 
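
A small worked example may make the definition concrete. The 
sketch below assumes a made-up trial in which 15 of 20 patients 
improve on a treatment when half would be expected to improve 
with no treatment effect at all. The function name and the 
numbers are illustrative assumptions, not figures from any study 
discussed here, and the exact one-sided binomial calculation is 
only one of several legitimate routes to a P value. 

```python
from math import comb


def binomial_p_value(successes: int, trials: int, null_rate: float) -> float:
    """One-sided P value: the chance of seeing `successes` or more in
    `trials` if each patient improves with probability `null_rate`,
    that is, if the treatment has no real effect."""
    return sum(
        comb(trials, k) * null_rate**k * (1 - null_rate) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical numbers: 15 of 20 improve where 50% would improve anyway.
p = binomial_p_value(15, 20, 0.5)
print(f"P = {p:.3f}")
```

With these invented figures the P value comes out near .02: a 
result this good or better would turn up by chance alone about 
twice in 100 such trials. 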

Here is why the P value is used to evaluate results. A 

limits or range). This is what happens when a political pollster 
reports that candidate X would now get 50 percent of the vote 
and thereby lead candidate Y by 3 percentage points, “with a 
3-percentage-point margin of error plus or minus and a 95 percent 
confidence level.” In other words, Mr. or Ms. Pollster is 95 
percent confident that X’s share of the vote would be someplace 
between 53 and 47 percent. Similarly, candidate Y’s share might 
be 3 percentage points greater (or less) than the figure predicted. 
In a close election, that margin of error could obviously turn a 
predicted defeat into victory. And that sometimes happens. 

An important point in looking at the results of political polls 
(and any other statements of confidence): In the reports we 
read, the plus or minus 3 (or whatever) percentage points is 
often omitted, and the pollster merely mentions a “3-point 
margin of error.” This means there is actually a 6-point range 
within which the truth probably lurks. 

The more people who are questioned in a political poll or 
the larger the number of subjects in a medical study, the greater 
the chance of a high confidence level and a narrow, and there¬ 
fore more reassuring, confidence interval. 
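
How quickly the interval narrows can be sketched with the 
textbook approximation for a polled percentage. The code below 
assumes a simple random sample and the usual normal 
approximation; real polls use more elaborate designs, and the 
sample sizes shown are hypothetical. 

```python
from math import sqrt


def margin_of_error(share: float, sample_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a polled share (a fraction).

    Assumes a simple random sample and the normal approximation.
    """
    return z * sqrt(share * (1 - share) / sample_size)


# Hypothetical sample sizes; a 50 percent share is the worst case.
for n in (100, 1000, 4000):
    moe = margin_of_error(0.50, n)
    print(f"n={n:>5}: 50% plus or minus {100 * moe:.1f} points")
```

About a thousand respondents gives the familiar 3 points. Note 
that quadrupling the sample only halves the margin, which is 
one reason pollsters rarely go much beyond that size. 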

No matter how reassuring they sound, P values and confi¬ 
dence statements cannot be taken as gospel, for .05 is not a 
guarantee, just a number. There are several important reasons 
for this. 

• All that P values measure is the probability that the results 
might have been produced by some sneaky random process. In 
20 results where only chance is at work, 1, on the average, will 
have a reassuring-sounding but misleading P value of < .05. 
One, in short, may be a false positive. 
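
The 1-in-20 figure can be checked with a toy simulation. The 
sketch assumes that when only chance is at work, a well-behaved 
P value is equally likely to land anywhere between 0 and 1; the 
function name and the random seed are arbitrary choices made 
for illustration. 

```python
import random


def count_false_positives(n_studies: int, alpha: float = 0.05, seed: int = 0) -> int:
    """Count studies of a useless treatment that still reach P < alpha.

    Assumption: with no real effect, each study's P value is drawn
    uniformly from the interval 0 to 1.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_studies))


print(count_false_positives(20))                  # a single batch of 20
print(count_false_positives(10_000) / 10_000)     # long-run share, near .05
```

Any one batch of 20 may show none, one, or several such 
results; it is only over many batches that the share settles 
near 5 percent. 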

Dr. Marvin Zelen points out that there may be 6,000 to 
10,000 clinical (medical) trials of cancer treatment under way 
today, and if the conventional value of .05 is adopted as the 
upper permissible limit for false positives, then every 100 studies 
with no actual benefit may, on average, produce 5 false-positive 
results. Hence, we may expect 50 false positive results, on 

red blood count (say, 0.1 g/100 mL, or a tenth of a gram per 
100 milliliters), may be statistically significant yet medically 
meaningless. 4 

• Eager scientists can consciously or unconsciously manip¬ 
ulate the P value by failing to adjust for other factors, by choos¬ 
ing to compare different end points in a study (say, condition on 
leaving the hospital rather than length of survival), or by choos¬ 
ing the way the P value is calculated or reported. 

There are several mathematical paths to a P value, such as 
the chi-square (χ²), t, F, r, and paired t tests. All may be 
legitimate. But be warned. Dr. David Salsburg of Pfizer, Inc., has 
written in the American Statistician of the unscrupulous 
practitioner who “engages in a ritual known as ‘hunting for P values’” 
and finds ways to modify the original data to “produce a rich 
collection of small P values” even if those that result from simply 
comparing two treatments “never reach the magical .05.” 

“If you look hard enough through your data,” contributes 
an investigator at a major medical center, “if you do enough 
subset analyses, if you go through 20 subsets, you can find 
one”—say, “the effect of chemotherapy on premenopausal 
women with two to five lymph nodes”—“with a P value less than 
.05. And people do this.” 
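
The arithmetic behind the investigator's warning is short. The 
sketch below assumes the subset tests are independent of one 
another, which real subsets of the same patients are not, so 
treat the result as a rough guide only. 

```python
def chance_of_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent tests of a treatment
    with no real effect gives P < alpha."""
    return 1 - (1 - alpha) ** n_tests


for k in (1, 5, 20):
    print(k, round(chance_of_any_false_positive(k), 2))
```

Under that assumption, 20 looks at the data give nearly a 
two-in-three chance (about .64) of at least one “significant” 
subset even when the treatment does nothing. 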

“Statistical tests provide a basis for probability statements," 
writes Dr. John Bailar, “only when the hypothesis is fully devel¬ 
oped before the data are examined. ... If even the briefest 
glance at a study’s results moves the investigator to consider a 
hypothesis not formulated before the study was started, that 
glance destroys the probability value of the evidence at hand.” 
(At the same time, Bailar adds, “review of data for unexpected 
clues . . . can be an immensely fruitful source of ideas” for new 
hypotheses “that can be tested in the correct way.” And 
occasionally “findings may be so striking that independent 
confirmation ... is superfluous.”) 

A rather sophisticated—and possibly touchy—line of ques¬ 
tioning that some reporters might want to try if they’re skeptical: 
How did you arrive at your P value? Did you use the test planned in 

children of this age group, we would expect only 3 cases in 100 
years. But in this nation with thousands of schools, we would 
occasionally”—such is chance—“find schools with 3 or more cases 
in a single year. Then one is faced with the problem of 
interpretation,” Zelen says. “Is this one of those rare events that is surely 
going to be observed? Or is it due to some causal factor?” 
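
Zelen's point can be put in rough numbers. The sketch below 
assumes that cases arise independently and follow a Poisson 
distribution, a common model for rare events; the count of 
50,000 schools is a hypothetical figure, since the text says 
only “thousands.” 

```python
from math import exp, factorial


def poisson_tail(mean: float, at_least: int, terms: int = 50) -> float:
    """P(X >= at_least) for a Poisson count with the given mean."""
    return sum(
        exp(-mean) * mean**k / factorial(k)
        for k in range(at_least, at_least + terms)
    )


cases_per_school_year = 3 / 100   # from the text: 3 cases in 100 years
n_schools = 50_000                # hypothetical; the text says "thousands"

p_one_school = poisson_tail(cases_per_school_year, 3)
print(f"chance for one school in one year: {p_one_school:.1e}")
print(f"schools expected to show it each year: {n_schools * p_one_school:.2f}")
```

For any single school the chance is a few in a million. Spread 
over tens of thousands of schools and many years, such a 
cluster becomes something we should expect to see now and then 
by chance alone. 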

A reporter in this instance might ask a statistician at the 
National Cancer Institute or a medical center: What is the 
chance of such an event in such a population? How many 
similar unusual events are probably never reported? 


“Power” and Numbers 


This gets us to another statistical concept: power. Statistically, 
“power” means the probability of finding something if it’s 
there. Example: Given that there is a true effect, say a difference 
between two medical treatments or an increase in cancer caused 
by a toxin in a group of workers, how likely are we to find it? 

Sample size confers power. Statisticians say, “Funny things 
can happen in small samples without meaning very much” . . . 
“There is no probability until the sample size is there” . . . 
“Large numbers confer power” . . . “Large numbers at least 
make us sit up and take notice.” 
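
A simulation shows what “sample size confers power” means in 
practice. The sketch assumes a hypothetical treatment that 
raises the recovery rate from 50 to 70 percent and asks how 
often trials of different sizes would reach P < .05. The test 
used (a pooled two-proportion z test) and all names and numbers 
are assumptions chosen for illustration. 

```python
import random
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided P value from the pooled two-proportion z test."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:          # everyone (or no one) recovered in both groups
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def simulated_power(rate_a: float, rate_b: float, n_per_group: int,
                    trials: int = 2000, alpha: float = 0.05, seed: int = 1) -> float:
    """Share of simulated trials in which a real difference reaches P < alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x1 = sum(rng.random() < rate_a for _ in range(n_per_group))
        x2 = sum(rng.random() < rate_b for _ in range(n_per_group))
        if two_proportion_p_value(x1, n_per_group, x2, n_per_group) < alpha:
            hits += 1
    return hits / trials


for n in (20, 50, 200):
    print(f"{n} patients per group: power about {simulated_power(0.5, 0.7, n):.2f}")
```

With 20 patients per group the real effect is missed most of 
the time; with 200 per group it is almost always found. 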

All this concern about sample size can also be expressed as 
the law of large numbers, which says that as the number of cases 
increases, the probable truth of a conclusion or forecast 
increases. The validity (truth or accuracy) and reliability 
(reproducibility) of the statistics begin to converge on the truth. 

We already learned this when we talked about probability. 


*There is another unrelated use of the word “power.” Scientists commonly speak of 
increasing or “raising” some quantity by a power of 2 or 3 or 100 or whatever. “Power” 
here means the product you get when you multiply a number by itself one or more 
times. Thus, in 2 × 2 = 4, 4 is the second power of 2, or to put it another way, there 
are two 2s in your equation. This is commonly written 2² and known as 2 to the second 
power or just 2 to the second. In 2 × 2 × 2 = 8, 2 has been raised to the third power. 
When you think about 2¹⁰⁰, you see the need for the shorthand. 

page 33] for a number like that is 100—that is, the square root 
of the original number. That means the number may vary by a 
minimum of 200 every year without even considering growth, 
the business cycle, or any other effect. This will supplement 
your reporter’s approach.” 

Looking for error in reported results, statisticians try to spot 
both false positives and false negatives. The false positive (or Type 
I or alpha error in statistical language you may see) is to find a 
result or effect where there is none. The false negative (or Type II 
or beta error) is to miss an effect where there is one. The latter is 
particularly common when there are small numbers. “There are 
some very well conducted studies with small numbers, even five 
patients, in which the results are so clear-cut that you don’t have 
to worry about power,” says Dr. Relman. “You still have to 
worry about applicability to a larger population, but you don’t 
have to doubt that there was an effect. When results are 
negative, however, you have to ask, How large would the effect have 
to be to be discovered?” 

Many scientific and medical studies are underpowered— 
that is, they include too few cases. “Whenever you see a negative 
result,” another scientist says, “you should ask, What is the 
power? What was the chance of finding the result if there was 
one?” One study found that an astonishing 70 percent of 71 
well-regarded clinical trials that reported no effect had too few 
patients to show a 25 percent difference in outcome. Half of the 
trials could not have detected a 50 percent difference. 
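
To see why so many trials fall short, it helps to work out how 
many patients a given difference demands. The sketch below uses 
a common normal-approximation formula for comparing two 
proportions. The 40 percent control rate, the 80 percent power 
target, and the function name are assumptions made for 
illustration, not figures from the study cited. 

```python
from math import ceil
from statistics import NormalDist


def patients_per_group(rate_control: float, rate_treated: float,
                       alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate patients needed in each arm to detect the difference.

    Normal-approximation formula for two proportions, two-sided test;
    treat the answer as a rough guide, not a trial design.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = (rate_control * (1 - rate_control)
                + rate_treated * (1 - rate_treated))
    difference = rate_control - rate_treated
    return ceil((z_alpha + z_power) ** 2 * variance / difference**2)


# Hypothetical: 40% of control patients have a bad outcome.
print(patients_per_group(0.40, 0.30))   # a 25 percent relative reduction
print(patients_per_group(0.40, 0.20))   # a 50 percent relative reduction
```

Under these assumptions a 25 percent reduction calls for 
roughly 350 patients in each arm, while a 50 percent reduction 
needs only about 80. A trial of a few dozen patients can detect 
neither reliably. 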

A statistician scanned an article on colon cancer in a leading 
journal. “If you read the article carefully,” he said, “you will 
see that if one treatment was better than the other—if it would 
increase median survival by 50 percent, from five to seven and a 
half years, say—they had only a 60 percent chance of finding it 
out. That’s little better than tossing a coin!” 

The weak power of that study would be expressed numerically 
as .6, or 60 percent. Scan an article’s fine print or footnotes, 
and you will sometimes find such a power statement. Most 

What are your numbers? After all, some researchers reportedly 
announced a new treatment for a disease of chickens by saying, 
“33.3 percent were cured, 33.3 percent died, and the other one 
got away.” 

Bias and Confounders 

One scientist once said that lefties are overrepresented 
among baseball’s heavy hitters. He saw this as “a possible result 
of their hemispheric lateralization, the relative roles of the two 
sides of the brain.” A critic who had seen more ball games said 
some simpler covariables could explain the difference. When 
they swing, left-handed hitters are already on the move toward 
first base. And most pitchers are right-handers who throw most 
often to right-handed hitters. 11 

Scientist A was apparently guilty of bias, meaning the 
introduction of spurious associations and error by failing to consider 
other influential factors. The other factors may be called 
covariables, covariates, intervening or contributing variables, confounding 
variables, or confounders. A simpler term may be “other explanations.” 

Statisticians call bias “the most serious and pervasive 
problem in the interpretation of data from clinical trials” . . . “the 
central issue of epidemiological research” . . . “the most 
common cause of unreliable data.” Able and conscientious scientists 
try to eliminate biases or account for them in some way. But not 
everybody who makes a scientific, medical, or environmental 
claim is that skilled. Or that honest. Or that all-powerful. Some 
biases are unavoidable by the very difficulty of much research, 
and the most insidious biases of all, says one statistician, are 
“those we don’t know exist.” 

Some biases may be uncovered by assiduous investigation. 
A father noticed that every time one of his 11 kids dropped a 
piece of bread on the floor, it landed with the buttered side up. 
“This utterly defies the laws of chance,” he exclaimed. Close 
examination disclosed the cause: The kids were buttering their 
bread on both sides. 

I told this story to one statistician, who said, “I was once 
called about a person who had won first, second, and third 
prizes in a church lottery. I was asked to assess the probability 
that this could have happened. I found out that the winner had 
bought nearly all the tickets.” 

He had of course asked the obvious question for both scien¬ 
tist and reporters: Could the relationship described be explained by other 
factors? 

Not everyone will tell you, of course, for bias is a pervasive 
human failing. As one candid scientist is said to have admitted, 
“I wouldn’t have seen it if I hadn’t believed it.” Enthusiastic 
investigators often tell us their findings are exciting. But they 
may be so exciting that the investigators paint the results in 
over-rosy hues. 

Other powerful human drives—the race for academic pro¬ 
motion and prestige, financial connections—can also create con¬ 
scious or unconscious conflicts of interest or attitudes that feed 
bias. Dr. Thomas Chalmers of Mount Sinai Medical Center in 
New York tells of a drug trial, financed by a pharmaceutical 
firm, in which both the head of the study committee and the 
main statisticians and analysts were the firm’s employees, 
though not so identified in any credits. He tells of a study of oral 
drugs for diabetes in which the fact that the first author had 
previously published 14 articles on the subject, and in 7 had 
acknowledged support by the drug manufacturers, was “not 
known to the reader.” 

In contrast, Chalmers describes a study also financed by a 
drug firm but with a contract specifying a study protocol de¬ 
signed by independent investigators and monitored by an out¬ 
side board less likely to be influenced by a desire for a favorable 
outcome. “It is never possible to eliminate” potential conflicts of 
interest in biomedical research, he concludes, but they should be 
disclosed so others can evaluate them. 13 

Even a genius may be biased. Horace Freeland Judson of 
Johns Hopkins University tells how Isaac Newton experimented 
with prisms and lenses and developed a theory of color, light, 
and the solar spectrum. He did not report seeing some dark 
lines—absorption lines, which mark varying wavelengths—that 
his instruments must have shown. A modern scientist argues 
that Newton’s theory, not his instruments, had no place for that 
evidence: “To the observing scientist, hypothesis is both friend 
and enemy.” 14 

For years technicians making blood counts were guided by 
textbooks that told them two or more “properly” studied samples 
from the same blood should not vary beyond narrow “allowable” 
limits. Reported counts always stayed inside those limits. A 
Mayo Clinic statistician rechecked and found that at least two 
thirds of the time the discrepancies exceeded the supposed 
limits. The technicians had been seeing what they had been told 
to expect and discounting any differences as mistakes. This also 
saved them from the additional labor of doing still more count¬ 
ing. 

Both the biased observer and the biased subject are common in 
medicine. A researcher who wants to see a treatment result may 
see one. A patient may report one out of eagerness to please the 
researcher. There is also the powerful placebo effect. Summarizing 
many studies, one scientist found that half the patients with 
headaches or seasickness—and a third of those suffering from 
coughs, mood changes, anxiety, the common cold, and even the 
disabling chest pains of angina pectoris—reported relief from a 
“nothing pill.” A placebo is not truly a nothing pill; the mere 
expectation of relief seems to trigger important effects within the 
body. But in a careful study the placebo should not do as well as 
a test medication; otherwise the test medication is no better than 
a placebo. 

Sampling bias is the bugaboo of both political polls and 
medical studies. Say you want to know what proportion of the 
populace has heart disease, so you stand on a corner and ask people 
as they pass. Your sample is biased, if only because it leaves out 
those too disabled to get around. Your problem, a statistician 
would say, is selection. A political pollster who fails to build a valid 
probability sample, easy when questioning only a thousand or 

logical or environmental study), there were probably many 
dropouts. A well-conducted study should describe and account 
for them. A study that does not may report a favorable treat¬ 
ment result by ignoring the fate of the dropouts—a confounding 
variable. 

Age, gender, occupation, nationality, race, income, so¬ 
cioeconomic status, health status, and powerful behaviors like 
smoking are all possible confounding—and frequently ig¬ 
nored—variables. In the 1970s, foes of adding fluoride to city 
water pointed to crude cancer mortality rates in two groups of 
10 U.S. cities. One group had added fluoride to water, the other 
had not, and from 1950 to 1970 the cancer mortality rate rose 
faster in the fluoridated cities. The National Cancer Institute 
pointed out that the two groups were not equal: The difference 
in cancer deaths was almost entirely explained by differences in 
age, race, and sex. The age-, race-, and sex-adjusted difference 
actually showed a small, unexplained lower mortality rate in the 
fluoridated cities. 17 
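
The adjustment the Institute made can be illustrated with 
invented figures. In the sketch below two hypothetical cities 
have identical death rates within each age group, but one has 
more older residents. The populations, rates, and the simple 
two-group age split are all assumptions; the method shown is 
direct standardization, one standard way of adjusting rates. 

```python
def crude_rate(population: dict, rates: dict) -> float:
    """Deaths per person across all age groups combined."""
    deaths = sum(population[group] * rates[group] for group in population)
    return deaths / sum(population.values())


def adjusted_rate(rates: dict, standard: dict) -> float:
    """Rate a city would show if it had the standard age mix."""
    return crude_rate(standard, rates)


# Invented figures: the same death rates within each age group...
rates_a = {"under 65": 0.001, "65 and over": 0.010}
rates_b = {"under 65": 0.001, "65 and over": 0.010}
# ...but city B has three times as many older residents.
city_a = {"under 65": 90_000, "65 and over": 10_000}
city_b = {"under 65": 70_000, "65 and over": 30_000}
standard = {group: city_a[group] + city_b[group] for group in city_a}

print("crude:   ", crude_rate(city_a, rates_a), crude_rate(city_b, rates_b))
print("adjusted:", adjusted_rate(rates_a, standard), adjusted_rate(rates_b, standard))
```

City B's crude rate is nearly double city A's, yet once both 
are given the same age mix the difference vanishes. 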

If you look carefully at the fate of women taking birth 
control pills, you find that advancing age and smoking are the 
two great confounders. You must take both into account to find 
the greatest clusters of ill effects. Smoking has been an important 
confounder in studies of industrial contaminants like asbestos, 
in which, again, the smokers suffer a disproportionate number 
of ill effects. 18 

A 1947 survey of Chicago lawyers showed that those who 
had mere high school diplomas before entering legal training 
earned 6.3 percent more, on the average, than college gradu¬ 
ates. The confounder here—the real explanation—was age. In 
1947 there were still many older lawyers without college de¬ 
grees, and they were simply older, on the average, and hence 
more established. 19 

Occupational studies often confront another seeming 
paradox: The workers exposed to some possible adverse effect turn 
out to be healthier than a control group of persons without such 
exposure. The confounder: the well-known healthy-worker effect. 
Workers tend to be healthier and live longer than the population 
in general. 

Some studies of workers in steel mills showed no overall 
increase in cancer, despite possible exposures to various carcino¬ 
gens. It took a look at black workers alone to find excess cancer. 
They commonly worked at the coke ovens, where carcinogens 
were emitted. This was a case where the population had to be 
stratified, or broken up in some meaningful way, to find the facts. 
Such findings in blacks often may be falsely ascribed to race or 
genetics, when the real or at least the most important 
contributing or ruling variables—to a statistician, the independent 
variables—are occupation and the social and economic plights that 
put blacks in vulnerable settings. The excess cancer is the 
dependent variable, the result. 

“In a two-variable relationship,” Dr. Gary Friedman 
explains, “one is usually considered the independent variable, 
which affects the other or dependent variable.” 20 Take the fact 
that more people get colds in winter. Here weather is commonly 
seen as the underlying or independent variable, which affects 
incidence of the common cold, the dependent variable. Actually, 
of course, some people, like children in school who are con¬ 
stantly exposed to new viruses, are more vulnerable to colds 
than others. In the case of these children, then, as in the case of 
the black workers at the coke ovens, there is often more than 
one independent variable. Also, some people think that an im¬ 
portant underlying reason for the prevalence of colds in winter 
may be that children are congregated in school, giving colds to 
each other, thence to their families, thence to their families’ 
coworkers, thence to the coworkers’ families, and so on. But 
cold weather—and home heating?—may still figure, perhaps by 
drying nasal passages and making them more vulnerable to 
viruses. 

The search for true variables is obviously one of the main 
pursuits of the epidemiologist, or disease detective—or of any 
physician who wants to know what has affected a patient, or of 
any student of society who seeks true causes. Like colds, many 
medical conditions, such as heart disease, cancer, and probably 
mental illness, have multiple contributing factors. Where many 
known, measurable factors are involved, statisticians can use 
mathematical techniques—the terms you will see include multiple 
regression, multivariate analysis, and discriminant analysis and factor, 
cluster, path, and two-stage least-squares analyses—to relate all the 
variables and try to find which are the truly important 
predictors. Yet some situations, like the striking decline in U.S. heart 
disease mortality in recent years, defy such analyses. These 
years have seen several major changes in American life that 
may play a role: less smoking among men, consumption of a 
leaner diet, more recreational exercise (though more sedentary 
work). Medical care is far better, including the treatment of 
hypertension, which disposes people to heart disease. Many of 
these variables cannot be well measured, and the effect of some 
is debatable, so—a common situation in science—the truth re¬ 
mains uncertain. 

Variability 

Doctors always say, “Most things are better in the morning,” 
and they’re mostly right. Most chronic or recurring conditions 
wax and wane. We tend to wake up at night when the condition 
is at its worst. Then, no matter what is done by way of 
treatment the next day, the odds are that we’ll feel better. 

This is regression toward the mean: the tendency of all values in 
every field of science—physical, biological, social, and eco¬ 
nomic—to move toward the average. Tall fathers tend to have 
shorter sons, and short fathers, taller sons. The students who get 
the highest grades on an exam tend to get somewhat lower ones 
the next time. The regression effect is common to all repeated 
measurements. 
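
The exam example can be simulated. The sketch assumes each 
score is a student's fixed ability plus a dose of luck that is 
drawn afresh for every exam; the averages, spreads, and seed 
are arbitrary choices made for illustration. 

```python
import random
from statistics import mean


def exam_scores(n_students: int = 10_000, seed: int = 2):
    """Two exam scores per student: fixed ability plus fresh luck each time."""
    rng = random.Random(seed)
    abilities = [rng.gauss(70, 10) for _ in range(n_students)]
    first = [a + rng.gauss(0, 10) for a in abilities]
    second = [a + rng.gauss(0, 10) for a in abilities]
    return first, second


first, second = exam_scores()
cutoff = sorted(first)[-(len(first) // 10)]       # top tenth on exam 1
top = [i for i, score in enumerate(first) if score >= cutoff]

print("class average, exam 1:", round(mean(first), 1))
print("top group, exam 1:    ", round(mean(first[i] for i in top), 1))
print("top group, exam 2:    ", round(mean(second[i] for i in top), 1))
```

The top group still beats the class average the second time, 
but by a smaller margin. Nothing happened to these students; 
part of their first score was luck, and luck does not repeat on 
demand. 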

Regression is part of an even more basic phenomenon: 
variation, or variability. Virtually everything that is measured 
varies from measurement to measurement. When repeated, every 
experiment has at least slightly different results. Take a patient’s 
blood pressure, pulse rate, or blood count several times in a 
row, and the readings will be somewhat different. Take them at 
different times of day or on different days, and the readings may 
vary greatly. 

The important reasons? In part, fluctuating physiology, but 
also measurement errors, the limits of measurement accuracy, 
and observer variation. Examining the same patient, no two 
doctors will report exactly the same results, and the results may 
be grossly different. If six doctors examine a patient with a faint 
heart murmur, only one or two may have the skill or keen 
hearing to detect it. Experimental results so typically differ from 
one time to the next that scientific and medical fakers—a Boston 
cancer researcher, for example—have been detected by the un¬ 
usual regularity of their reported results, with numbers agreeing 
too well and the same results appearing time after time, with not 
enough variation from patient to patient. 

Biological variation is the most important cause of variation in 
physiology and medicine. Different patients, and the same pa¬ 
tients, react differently to the same treatment. Disease rates 
differ in different parts of the country and among different popu¬ 
lations, and—alas, nothing is simple—there is natural variation 
within the same population. 

Every population, after all, is a collection of individuals, 
each with many characteristics. Each characteristic, or variable, 
such as height, has a distribution of values from person to person, 
and—if we would know something about the whole popula¬ 
tion—we must have some handy summaries of the distribution. 
We can’t get much out of a list of 10,000 measurements, so we 
need single values that summarize many measurements. 

Enter here the familiar average or, more exactly, the mean, 
median, and mode. These and a few other measures can give us 
some idea of the look of the whole and its many measurable 
properties, or parameters. 

When most of us speak of an average, we mean simply the 
mean or arithmetic average, the sum of all the values divided by the 
number of values. The mean is no mean tool; it is a good way 
to get a typical number, but it has limitations, especially when 
there are some extreme values. There is said to be a memorial 
in a Siberian town to a fictitious Count Smerdlovski, the world’s 
champion at Russian roulette. On the average he won, but his 
actual record was 73 and 1. 21 

If you look at the average salary in a hospital, you will not 
know that half the personnel may be working for the minimum 
wage, while a few hundred persons make $100,000 or more a 
year. You may learn more here from the median, the figure that 
divides a population into two equal halves. The median can be 
of value when a group has a few members with extreme values, 
like the 400-pounder at an obesity clinic whose other patients 
weigh from 180 to 200 pounds. If he leaves, the patients’ mean 
weight might drop by 10 pounds, but the median might drop 
just 1 pound. 22 
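
The clinic example is easy to check. The roster below is 
invented (20 patients spaced a pound apart, plus the one 
400-pounder), so the exact figures are assumptions, but the 
pattern is the one the text describes. 

```python
from statistics import mean, median

others = list(range(181, 201))      # 20 invented patients, 181 to 200 pounds
with_outlier = others + [400]

print("mean drops by", round(mean(with_outlier) - mean(others), 1))
print("median drops by", median(with_outlier) - median(others))
```

In this made-up roster one patient's departure moves the mean 
by about 10 pounds and the median by half a pound. 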

The most frequently occurring number or value in a distri¬ 
bution is called the mode. When the median and the mode are 
about the same, or even more when mean, median, and mode 
are roughly equal, you can feel comfortable about knowing the 
typical value. 

You still need to know something about the exceptions, in 
short, the dispersion (or spread or scatter) of the entire distribu¬ 
tion. One measure of spread is the range. It tells you the lowest 
and highest values. It might inform you, for example, that the 
salaries in that hospital range from $10,000 to $250,000. 

You can also divide your values into 100 percentiles, so you 
can say someone or something falls into the 10th or 71st 
percentile, or into quartiles (fourths) or quintiles (fifths). One useful 
measure is the interquartile range, the interval between the 75th 
and 25th percentiles—this is the distribution in the middle, 
which avoids the extreme values at each end. Or you can divide 
a distribution into subgroups—those with incomes from $10,000 
to $20,000, for example, or ages 20 to 29, 30 to 39, and so on. 
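
These summaries take only a few lines to compute. The payroll 
below is invented, apart from the $10,000 and $250,000 
endpoints borrowed from the example above, and `spread_summary` 
is a hypothetical helper. It relies on Python's standard 
`statistics` module, whose default quartile method is one of 
several in common use, so other software may give slightly 
different cut points. 

```python
from statistics import mode, quantiles


def spread_summary(values):
    """Range, median, interquartile range, and mode for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4)   # the three quartile cut points
    return {
        "range": (min(values), max(values)),
        "median": q2,
        "iqr": q3 - q1,
        "mode": mode(values),
    }


# Invented hospital payroll, in dollars.
salaries = [10_000] * 6 + [12_000, 15_000, 18_000, 25_000,
                           40_000, 100_000, 250_000]
print(spread_summary(salaries))
```

The range is set entirely by the two extremes, while the 
interquartile range describes the middle half of the payroll 
and ignores them. 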

All these values can easily be plotted. With many of the 
things that scientists, economists, or others measure—IQs, for 
example, and other test scores—we typically tend to see a famil¬ 


iar, bell-shaped 
end, or tail. T 
19th-century ( 
But you may 
clusters, a him 
A widely 
great deal. Nc 


i 


tance from th 
range, this ha 
how spread o 
In what one s 
in most sets ■ 
being measur 
average by n 
more than 2 
than 2.57 sta 
‘Once yc 
shaped distril 
the who 1 - 
curve 

variation ol 
the more spr 


* 


•Then- '» n 
depending on iht 
diflervnort beiwvt 
number of qwnt 
of a population r- 
n>uit An in 


Sometime* 
mean, being an 
iiandard mm or u 




All the aim 










Source: https://www.industrydocuments.ucsf.edu/docs/lnbjOOOO 






Example: If the average score of all students who take the 
SAT college entrance test is relatively low and the spread—the 
standard deviation—relatively large, this creates a very long- 
tailed, low-humped curve of test scores, ranging, say, from 
around 300 to 1500. But if the average score of a group of 
brighter students entering an elite college is high, the standard 
deviation of the scores will be less and the curve will be high¬ 
humped and short-tailed, going from maybe 900 to 1500. 

“If I just told you the means of two such distributions, you 
might say they were the same,” another scientist says. “But if I 
reported the means and the standard deviations, you’d know 
they were different, with a lot more variations in one.” 
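
The scientist's point can be shown with two small sets of 
invented scores that share the same mean. The numbers are made 
up for illustration; the standard deviation is the sample 
version from Python's `statistics` module. 

```python
from statistics import mean, stdev

# Invented test scores for two groups with the same average.
group_a = [900, 950, 1000, 1050, 1100]
group_b = [500, 750, 1000, 1250, 1500]

for name, scores in (("A", group_a), ("B", group_b)):
    print(name, "mean:", mean(scores),
          "standard deviation:", round(stdev(scores)))
```

Both groups average 1000, but the second group's standard 
deviation is five times the first's. 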

From a human standpoint, variation tells us that it takes 
more than averages to describe individuals. Biologist Stephen 
Jay Gould learned in 1982 that he had a serious form of cancer. 
The literature told him the median survival was only eight 
months after discovery. Three years later he wrote in Discover, 
“All evolutionary biologists know that means and medians are 
the abstractions,” while variation is “the reality,” meaning “half 
the people will live longer” than eight months. 

Since he was young, since his disease had been diagnosed 
early, and since he would receive the best possible treatment, he 
decided he had a good chance of being at the far end of the 
curve. He calculated that the curve must be skewed well to the 
right, as the left half of the distribution had to be “scrunched up 
between zero and eight months, but the upper right half [could] 
extend out for years." He concluded, “I saw no reason why I 
shouldn’t be in that small tail. ... I would have time to think, 
to plan and to fight.” Also, since he was being placed on an 
experimental new treatment, he might if fortune smiled “be in 
the first cohort of a new distribution with ... a right tail 
extending to death by natural causes at advanced old age.” 23 

Statistics cannot tell us whether fortune will smile, only that 
such reasoning is sound. 

brought forth for on-camera testimonials. Except for some 
newspapers that decided to print nothing, the story flew far and 
wide. 

The head investigator, a chief resident in neurosurgery, 
cautioned that the results, though encouraging, were Very 
tally* and “certainly do not prove this is an effective treatment.” 
He advised healthy skepticism. But headlines unequivocally 
read: “Alzheimer’s Test Found Successful,” “Alzheimer’s: A New 
Promise,” “First Breakthrough Against Alzheimer’s,” “Pump 
Offers Hope,” “Possible Alzheimer’s Cure.” 

Within two months the medical center logged 2,600 phone 
calls, mainly from desperate families, and critics began asking 
why a press conference had been held, since a study of only four 
patients—with unblinded investigators getting their assessments 
from hopeful families—meant little. 

Harvard’s Dr. Jay Winsten concluded that “the decision to 
hold a press conference ... far outweighed in impact the 
modulating effect of the investigators’ qualifying language. The 
visual impact of [one] patient’s on-camera testimonials all but 
guaranteed that TV coverage would oversell the research, 
despite any qualifying language.” 1 

When dubious claims are made—about Alzheimer's, a new 
cancer drug, a possible AIDS cure—and the claims get widely 
reported, there is commonly a lot of postmortem clucking and 
soul-searching among reporters and editors. Then someone else 
makes some sensational claim, and the same thing may happen 
all over again. 

The biggest error in medical science, according to Dr. 
Thomas Chalmers, is “the uncontrolled pilot study in which the 
investigators try a treatment on 10 patients, and if it seems to 
work . . . are tempted to report it” to fellow scientists, let alone
the media. 2 

All science is only a stab at the truth. Even with the best of
statistics, “We scientists don’t know how to tell the whole truth,”
Mosteller reminds us. 2 Outside this honest limitation lie vast
realms of inadequate science with plausible-sounding yet shaky 
statistics. A French physician, Pierre Charles Alexandre Louis,
said 150 years ago, “The only reproach which can be made to
the numerical method” is that it “requires much more labor and
time than the most distinguished members of our profession”
often give it. “Some days,” says one modern statistician, “I think
every idiot in the country who can put his hands on a computer
program thinks he’s a statistician.”

The big problems of statistics, say its best practitioners, 
have little to do with computations and formulas. They have to 
do with judgment, we’re told, with how to design a study, how 
to conduct it, then analyse and interpret the results. In a day of 
frenzied media competition for the public’s eye and ear—and 
many chances to do harm by shaky reporting—journalism too 
calls for sophisticated judgment. How, then, can we have some 
hope of telling which studies seem credible, which we should 
report? 

A fundamental principle is that every conscientiously con¬ 
ducted study has a careful design: a method or plan of attack to 
include the right kind and number of patients or petri dishes 
and to try to eliminate bias. Different problems require different 
methods, and one of the most basic questions in science is, Can
this kind of experiment, this design, yield the answer?

This is not a simple question for a reporter to answer, but 
there is much we can know. What kinds of studies, what kinds 
of numbers and controls and methods, should we look for? 

Experiments versus Seductive Anecdotes 

Students and eggs can be graded, citizens and cities can be 
credit-rated, and scientific evidence can be weighed according to 
what has been called a hierarchy of evidence. Some kinds of 
studies carry little weight, some more, some a great deal. 

Science and medicine started with anecdotes, unreliable as far
as generalization is concerned, yet provocative. Anecdotes ma¬ 
tured into systematic observation, the most ancient form of 
science. Observation told the ancients much about the stars, it 
told the pharaohs’ physicians much about the sick, and it is still
important, for simple “eyeballing” has developed into data collection
and the recording of case histories. These are respectable, yea,
indispensable methods yet still only one part of science. Case
histories may not be typical, or they may reflect the beholder.
Medicine continues to be plagued by Big Authorities who insist,
“I know what I see.”

There can be useful, even inspired, observation and analy¬ 
sis of natural experiments. Excess fluoride in some waters hardened 
teeth, and this observation led to fluoridation of drinking water 
to prevent tooth decay. There are also man’s inadvertent experi¬ 
ments, disastrous and benign, to be studied. Hiroshima trig¬ 
gered wide analysis of the effects of nuclear radiation, invaluable 
yet frustrating because there were no good measures of exposure 
levels, a gap that has caused confusion and controversy ever 
since. 

In 1585 or so, Galileo dropped those weights from a tower 
and helped invent the scientific experiment: a study in which the 
experimenter controls the conditions—controlled conditions are 
the heart of the experimental method—and records the effect. 
Experiments on objects, animals, germs, and people matured 
into the modern experimental study, in which the experimenter
typically changes only one or some other planned number of 
variables to see the outcome. 

Clinical Trials 

The experimental method is the essence of experimental 
medicine’s current “gold standard”: 4 the controlled, randomized clinical
trial. At its best, the investigator tests a treatment or drug or
some other intervention by randomly selecting at least two com¬ 
parable groups, the experimental group that is tested or treated and 
a control group that is observed for comparison. 

True clinical trials are expensive and difficult. It has been 
estimated that of 100 scheduled trials, 60 are abandoned, not

controls are often misleading—the groups compared are frequently
not comparable, the treatments may have been given
by different methods—but they are still at times useful. 

What Makes a Study Honest? 

Obviously, all studies, including the best, have potential 
pitfalls: 

• Lack of adequate controls is fatal if you really want to put the 
results in the bank. 

• The group or sample studied, 10 people or 10,000, must be
large enough to get a valid result and representative enough to 
apply to a larger population. Because people vary so widely in 
their reactions, and a few patients can fool you, fair-sized groups 
of patients are usually needed. And enough of the right kind of 
subjects are needed for a suitable sample. Picking patients for a 
medical study is no different from picking citizens to be ques¬ 
tioned in a political poll. In both, a sample is studied, and 
inferences—the outcome of an election, the results in patients in 
general—are made for a larger population. 

To get a large enough sample, medical researchers more 
and more try to conduct multicenter trials, which are appealing 
because they can include hundreds of patients, but expensive
and tricky because one must try to maintain similar patient
selection and quality control at 10 or 100 institutions. Successful
multicenter trials established the value of controlling hyperten¬ 
sion to prevent strokes. They demonstrated the strong probabil¬ 
ity that less extensive surgery is as effective as more drastic 
surgery for many breast cancers. 

• The sample should be randomized—divided by some random
method into comparable experimental and control groups. Randomization
can easily be violated. A doctor assigning patients to
treatment A or B may, seeing a particular type of patient, say or
think, “This patient will be better on B.”

If treatment B has been established as better than A, there 
should be no random study in the first place and certainly no 
study of that doctor's patient. When randomization is violated,
“the trial's guarantee of lack of bias goes down the drain,” says
one critique. As a result, patients who consent to randomization
are often assigned to study groups according to a list of
computer-generated random numbers.
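
Assignment by computer-generated random numbers can be sketched in a few lines. This is a minimal illustration only: the function name, the patient numbering, and the choice of two equal-sized arms are assumptions, and real trials typically use more elaborate schemes.

```python
import random

def randomize(patient_ids, seed=None):
    """Split patients into two equal-sized arms by shuffling the list."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)          # the computer, not the doctor, decides the order
    half = len(shuffled) // 2
    return {"A": sorted(shuffled[:half]), "B": sorted(shuffled[half:])}

# Twenty hypothetical patients, numbered 1 through 20.
groups = randomize(range(1, 21), seed=42)
print("Treatment A:", groups["A"])
print("Treatment B:", groups["B"])
```

Because the list is shuffled before anyone sees a patient, no one's hunch about who "will be better on B" can steer the assignment.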

• To combat bias—the influence of confounding variables—
and get answers applicable to various populations, the sample or 
study population must often be stratified, or separated into 
groups by age, sex, socioeconomic status, and so on. Failure to 
stratify can hide true associations. The role of high-absorbency 
tampons in toxic shock syndrome was clarified only when the
cases were broken down by precise type of tampon used. 

The identification of important subcategories of patients 
can be tricky indeed. A study of open-heart surgery patients 
may fail to separate out those who had to wait for their surgery. 
But some patients die waiting, and those left are relatively 
stronger patients who do better, on the average, than those 
treated immediately after diagnosis. 

We reporters may also fail to pay attention to stratification,
or distribution. In early 1985 the President’s Council of Economic
Advisers reported that—to quote the page-one lead in a
major newspaper—“elderly Americans have achieved economic
parity with the rest of the population and no longer are a disadvantaged
group.” Not for several paragraphs, now on an inside
page, did the story note that “there’s a lot of variability,” and
older people are also “more likely ... to have members with
incomes below the average of their age group.” 5 In short, there
are still many elderly trapped in poverty. 
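
The warning that failure to stratify can hide true associations can be illustrated with invented numbers. In the sketch below every count is hypothetical and the stratum names are placeholders; the point is only that a pooled comparison can show no association while each subgroup shows a strong one.

```python
# Hypothetical counts: (number ill, number studied) in each stratum.
strata = {
    "younger": {"exposed": (30, 100), "unexposed": (10, 100)},
    "older":   {"exposed": (10, 100), "unexposed": (30, 100)},
}

def risk_ratio(exposed, unexposed):
    """Illness rate in the exposed divided by the rate in the unexposed."""
    return (exposed[0] / exposed[1]) / (unexposed[0] / unexposed[1])

def pooled(group):
    """Add up the counts for one group across all strata."""
    ill = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return ill, total

pooled_rr = risk_ratio(pooled("exposed"), pooled("unexposed"))
stratum_rr = {name: risk_ratio(s["exposed"], s["unexposed"])
              for name, s in strata.items()}

print(f"pooled risk ratio: {pooled_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"{name} risk ratio: {rr:.2f}")
```

Lumped together, the made-up exposure looks harmless; broken down, it triples the risk in one group and cuts it to a third in the other.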

• To combat bias in investigators or patients, studies should be 
blinded—to the extent feasible, single-, double-, or, best of all,
triple-blinded, so that neither the doctors nor the nurses administering
a treatment nor the patients nor those who assess the results 
know whether today’s pill is treatment A, treatment B, or an 
ineffective placebo. Otherwise, a doctor or patient who yearns for 
a good result may see or feel one when the “right” drug is given.
There is a tale of an overzealous receptionist who, knowing
which patients were getting the real drug and not the placebo,
was so encouraging to these patients that they began saying they
felt good, willy-nilly. 6

Barring observant receptionists, the use of a placebo—from 
the Latin meaning “I shall please”—may help maintain blindness.
Placebos actually give some relief in a third of all patients,
on the average, in various conditions. The effect is usually tem¬ 
porary, however, and a truly effective drug ought to work sub¬ 
stantially better than the placebo. 

Blinding is often impossible or unwise. Some treatments 
don’t lend themselves to it, and some drugs quickly reveal them¬ 
selves by various effects. But an unblinded test is a weaker test. 

• Finally, what makes a study honest is honesty. John Bailar 
warns of deliberate or careless deceptions that seem to be uni¬ 
versally accepted today, practices that sometimes have much 
value but at other times are “inappropriate and improper and, 
to the extent that they are deceptive, unethical.” Among them: 
the selective reporting of findings, leaving out some that might 
not fit the conclusion; the reporting of a single study in multiple 
fragments, when the whole might not sound so good; and the 
failure to report the low power of some studies, their inability to 
detect a result even if one existed. 7 

Dr. Charles Moertel of the Mayo Clinic says, 

Probably the majority of cancer patients treated with chemotherapy 
today are receiving regimens that have not been proved effective by 
randomized trial. . . . Many articles published in our major journals 
make claims for fantastic therapeutic accomplishments with no randomized
controls. . . . Many, if not most, of the randomized studies
. . . are of such poor quality that their results are unbelievable. . . .
Precious few have withstood the scrutiny of carefully designed 
confirmatory scientific study. 

He calls a multitude of poor methods statistical legerde¬ 
main: “the games we play, trying to squeeze out that little bit of 
breakthrough." Why the pressure to play them? “Salvation," Dr. 
David Salsburg answers. “Fruit in this world (increases in salary,
prestige, invitations to speak) and beyond this life (continual
references in the citation index).” 8

Epidemiology: Hippocrates to AIDS 

Clinical studies deal with patients. Epidemiology deals with 
populations, which sometimes are large groups of patients. Epi¬ 
demiology seeks the causes of both health and disease by placing 
a population under its own kind of microscope, the epidemiologi¬ 
cal investigation. 

Epidemiological studies in many ways parallel clinical stud¬ 
ies—some studies are both—and are subject to many of the 
same pitfalls and rules, like avoiding bias and stratifying to get 
the right answers about the right subgroups. An old saw, in fact, 
goes, an epidemiologist is a physician broken down by age and 
sex. 

Epidemiology in its early days was concerned wholly with 
epidemics of typhoid, smallpox, and other infections. But epide¬ 
miologists today also ask, “What should we eat and how should 
we live to stay healthy?" and they study large groups to see how 
the healthiest and unhealthiest live. Hippocrates has been called 
the first environmentalist because he observed that it was 
healthier to live in high places than in low ones. Anticipating 
today’s environmentalists, he blamed bad air and bad water and 
may have been partly right. But he failed to stratify; otherwise 
he might have noticed that the people who lived high were also 
wealthier and better nourished than those who lived low. 9

In 1775 Percival Pott scored a famous epidemiological
success by observing the high rate of scrotum cancer in Lon¬ 
don’s chimney sweeps and correctly blaming it on their exposure 
to soot—burned organic material, much like a smoked ciga¬ 
rette. A century later, John Snow, plotting London cholera 
cases on a map and noting a cluster around one source of
drinking water, removed the handle from the now famed Broad 
Street pump and helped end a deadly epidemic. The 19th-century
French advocate of statistical methods, Pierre Louis,
observed hospital patients and helped stop the use of bleeding as
a treatment. Ignaz Semmelweis showed that doctors’ dirty
hands transmitted deadly childbed fever to mothers.

Modern epidemiologists successfully indicted smoking as a
cause of lung cancer and heart disease and identified the association
of fats and cholesterol with clogging of the arteries. They
evaluate vaccines, assess new methods of health care delivery,
and track down the causes of new scourges like AIDS, toxic
shock syndrome, and Legionnaires’ disease, all by several
methods. All are valuable. All are full of traps. 

• Epidemiology, like all of science, started with observational
studies, and these remain important. They are weak and uncertain,
we have noted, when it comes to determining cause and
effect. Yet observation is how we first learned of the unfortunate
effects of toxic rain, Agent Orange, cigarette smoking, and
many sometimes helpful, sometimes harmful medications—and
of certain sexual practices and addicts’ use of dirty needles on
AIDS.

• Some observational studies are simply descriptive—describing
the incidence, prevalence, and mortality rates of various
diseases, for example. Other, analytic studies seek to analyze or 
explain: the Seven-Country Study, for example, that helped 
associate high meat and dairy fat and cholesterol consumption 
with excess risk of coronary heart disease. Ecological studies look 
for links between environmental conditions and illness. Human 
migrations—like that of the Japanese who come to the United 
States, eat more fat, and get more disease than they did in 
Japan—are among valuable natural experiments. 

• The simplest observational measurement is a count. Sam¬ 
pling is just a more sophisticated kind of count. You can’t count 
or question everybody, so you seek a sample that represents the 
whole. Many epidemiological surveys rely on samples—among
them, government surveys of health and nutritional habits. 
Samples and surveys often use questionnaires to get information. 

A sample or survey is never more than a snapshot of the

smoking to lung cancer, the association of birth control pills with
blood vessel problems, and the transmission patterns of AIDS 
were identified in case-control studies that pointed to the need 
for broader investigation. 

Cohort or incidence studies are motion pictures. They pick a 
group of people, or cohort—a cohort was a unit of a Roman
legion—often stratify or divide them into subgroups, then follow 
them over time, often for years, to see how some disease or 
diseases develop. These studies are costly and difficult. Subjects 
drop out or disappear. Large numbers must be studied to see 
rare events. But cohort studies can be powerful instruments and 
substitutes for randomized experiments that would be ethically 
impossible. You can't ethically expose a group to an agent that 
you suspect would cause a disease. You can watch a group so 
exposed. 

The noted Framingham study of ways of life that might be 
associated with developing heart disease has followed more than 
5,000 residents of that Massachusetts town since 1948. The 
American Cancer Society’s 1952-55 study of 187,783 men aged
50 to 69, with 11,780 of them dying during that period, did 
much to establish that cigarette smoking was strongly associated 
with developing lung cancer. 10 

• Many epidemiological, as well as clinical, studies are 
handicapped because they must be retrospective. They look back 
in time—at medical records, vital statistics, or people’s recollec¬ 
tions (for example, those collected in interviews in a case-control 
study). People who have a disease are questioned to try to find 
common habits or exposures. Women with cervical cancer are 
interviewed to see how many took possibly guilty hormones and 
how many did not. People who live around a Love Canal are 
asked if they have been ill.

Retrospective studies are notoriously unreliable. Memories 
fail or play tricks. Old records are poor and misleading. Defini¬ 
tions of diseases and methods of diagnosis vary sharply over the 
years. The patients you find may not be representative. A retro¬ 
spective study, however intriguing, generally only says that 
there may be something here that ought to be investigated. 


Questions Reporters Can Ask


Just because Dr. Famous or Dr. Bigshot says this is what he found doesn’t mean
it is necessarily so. 

—Dr. Arnold Relman


Ask to see the numbers, not just the pretty colors. 

—Dr. Richard Margolin 
National Institutes of Health,
describing PET scans to reporters


WHAT questions should we reporters ask—to make our
news solid, to report the more valid claims and ignore the weak 
and phony? When a scientist or physician or anyone else says, 
“I’ve discovered that . . . ,” what should we ask?

In 1949, a year after Britain’s National Health Service— 
“socialized medicine”—was launched, my editors sent me to
Britain to see how it was working. A bit stumped, I asked Dr. 
Morris Fishbein, the provocative genius who long edited the 
Journal of the American Medical Association, “How can I, a reporter, 
tell whether a doctor is doing a good job?” He immediately said,
“Ask him how often he has a patient take off his shirt.”

His lesson was plain: No physical examination is complete 
unless the patient takes off his or her clothes. Most reporters are 
not skilled statisticians, but we can ask some similarly revealing 
questions. Many of these are not even statistical, just simple 
ones that, like Fishbein’s, probe soft spots and often disclose 
either a conscientious approach or one that can’t be trusted. 
We can learn here from one method of science. We said

Why did you do it that way? Do you think it was the right kind of
study to get the answer to this question or problem? 

Was it a true human experiment, if possible, with comparable groups
picked at random for comparison? If not, why not? And what was the 
substitute? 

If an investigator patiently—you hope—tells you about an 
acceptable-sounding design, that’s worth a brownie point. If the 
answer is “Huh?” or a nasty one, that may tell you something
else. 

Are you presenting preliminary data or something fairly conclusive? 
Are you presenting a conclusion or a hypothesis for further study? “Preliminary”
and “interesting” can mean “unproved.”

If the result is not reasonably conclusive, should there be further studies 
and what kind? 

How many subjects, patients, cases, or people are you talking about?
Are these numbers large enough, statistically rigorous enough, to get the 
answers you want? Was there an adequate number of patients to show a 
difference between treatments? Why are you calling a press conference to 
report on four patients? 

Small numbers can sometimes carry weight. And they may 
sometimes be the only ones possible. “Sometimes small samples 
are the best we can do," one researcher says. But larger numbers 
are always more likely to pass statistical muster. 
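
One way to see why larger numbers pass muster more easily is to simulate many imaginary studies of a treatment that truly works. The sketch below is only an illustration: the success rates (50 percent versus 70 percent), the group sizes, and the function names are assumptions, and the test used is a simple two-proportion z-test rather than any method named in the text.

```python
import math
import random

def two_proportion_p_value(successes_a, successes_b, n):
    """Two-sided P value from a pooled two-proportion z-test (equal group sizes)."""
    pooled = (successes_a + successes_b) / (2 * n)
    std_err = math.sqrt(2 * pooled * (1 - pooled) / n)
    if std_err == 0:
        return 1.0  # both groups all-success or all-failure: no evidence of a difference
    z = (successes_a - successes_b) / (n * std_err)
    return math.erfc(abs(z) / math.sqrt(2))

def estimated_power(rate_a, rate_b, n, trials=2000, alpha=0.05, seed=1):
    """Share of simulated studies that reach P < alpha when the effect is real."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = sum(rng.random() < rate_a for _ in range(n))
        b = sum(rng.random() < rate_b for _ in range(n))
        if two_proportion_p_value(a, b, n) < alpha:
            hits += 1
    return hits / trials

# Hypothetical treatment: 70 percent success versus 50 percent for the control.
small = estimated_power(0.5, 0.7, n=10)
large = estimated_power(0.5, 0.7, n=200)
print(f"10 patients per group:  real effect detected in {small:.0%} of studies")
print(f"200 patients per group: real effect detected in {large:.0%} of studies")
```

With the small groups, most of the simulated studies miss an effect that is really there; with the large groups, nearly all of them find it.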

The number studied can also depend on the subject. A 
thorough physiological study of five cases of some difficult disor¬ 
der may be important. One new case of smallpox would be a 
shocker in a world in which smallpox has supposedly been elimi¬ 
nated. In June 1981 the federal Centers for Disease Control 
reported that five young men, all active homosexuals, had been 
treated for Pneumocystis carinii pneumonia at three Los Angeles 
hospitals. 1 This alerted the world to what soon became the 
AIDS epidemic. 

Who were your subjects? How were they selected? What were your 
criteria for admission to the study? Were rigorous laboratory tests used to

the treatment, (2) those getting it, and (3) those assessing the outcome know
who was getting what, or were they indeed blinded, knowing only that they 
were comparing A and B (or A, B, and C, perhaps)? 

Could those giving or getting the treatment have easily guessed which 
was which by a difference in reaction or taste or other results? 

Not every study can be a blind study. One researcher says, 
“There can be ethical problems in not telling patients what drug
they’re taking and the possible side effects. People are not guinea
pigs.” True enough, but a blinded study will always carry more
conviction. 

Were there other accepted quality controls? For example, making
sure (perhaps by counting pills or studying urine samples) that 
the patients supposed to take a pill really took it. 

Were you able to follow your protocol or study plan? 

If there were questionnaires, interviews, or a survey: Were 
the questions likely to elicit accurate, reliable answers? Was it really possible 
to get accurate answers to these questions?

Sampling is as common in medical studies as in political 
polling. Every study examines a sample, not the whole popula¬ 
tion. The sample must be reasonably accurate to give valid 
results. But badly worded questions can also distort the results. 
Respondents’ answers can differ sharply, depending on how
questions are asked. Example: In one study 1,153 subjects were 
asked which is safer, a treatment that kills 10 percent of every 
100 patients or a treatment with a 90 percent survival rate? 
More people voted for the second way of saying precisely the 
same thing.* 

People commonly give inaccurate answers to sensitive 
questions, such as those about sexual behavior. They are noto¬ 
riously inaccurate in reporting their own medical histories, even 
those of recent months. 

Ask: Did you pretest your questions for effectiveness before doing your
actual survey? 

Also: What was your nonresponse rate? Do you report it?

Could your results have occurred just by chance? Have any statistical
tests been applied to test this? 

Did you calculate a P value? Was it favorable—.05 or less? (Re¬ 
ported as < .05; see Chapter 3.) P values and confidence state¬ 
ments need not be regarded as straitjackets, but like jury ver¬ 
dicts, they indicate reasonable doubt or reasonable certainty. 

Remember that positive findings are more likely to be reported
and published than negative findings. Remember that a
favorable-sounding P value of < .05 means only that there is
less than 1 chance in 20, or a 5 percent probability, that the statistics
could have come out this way by pure chance when there was
actually no effect—so about 1 in every 20 tests of a treatment with
no real effect may still yield a misleading false positive.
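
The 1-in-20 figure can be checked by simulating studies in which the treatment truly does nothing. The sketch below is an illustration under stated assumptions: measurements are drawn from the same bell-shaped distribution for both groups, the group size of 50 is arbitrary, and a simple z-test with a known standard deviation stands in for whatever test a real study would use.

```python
import math
import random

def null_study_p_value(rng, n=50):
    """Simulate one study in which treated and control groups truly do not differ."""
    treated = [rng.gauss(0, 1) for _ in range(n)]
    control = [rng.gauss(0, 1) for _ in range(n)]
    diff = sum(treated) / n - sum(control) / n
    z = diff / math.sqrt(2 / n)               # standard deviation of 1 in each group
    return math.erfc(abs(z) / math.sqrt(2))   # two-sided P value

rng = random.Random(7)
p_values = [null_study_p_value(rng) for _ in range(10_000)]
false_positive_rate = sum(p < 0.05 for p in p_values) / len(p_values)
print(f"share of no-effect studies with P < .05: {false_positive_rate:.3f}")
```

Roughly 5 percent of these do-nothing studies still come out "statistically significant," which is what the .05 threshold allows.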

There are also ways and ways of arriving at P values. For 
example, an investigator may choose to report one of several end 
points: death, length of survival, blood pressure, other measure¬ 
ments, or just the patient’s condition on leaving the hospital. All 
can be important, but a P value can be misleading if the wrong 
one is picked or emphasized. 

You might want to ask: Are all the important end points and their 
P values reported? Also: Was the test giving the P value the appropriate
test, as planned in your written protocol, or did you finally do more than
one kind of test? (And perhaps report only the best answer?) What 
were the other values? 
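
Simple arithmetic shows why it matters how many end points or tests were tried. The sketch below assumes, for illustration only, that the tests are independent of one another and that the treatment has no real effect; the function name is hypothetical.

```python
def chance_of_false_positive(num_tests, alpha=0.05):
    """Probability that at least one of several independent tests of a
    treatment with no real effect comes out below alpha."""
    return 1 - (1 - alpha) ** num_tests

for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: {chance_of_false_positive(k):.1%} chance of at least one P < .05")
```

Under these assumptions, an investigator who tries enough end points and reports only the best one has a good chance of finding a "significant" result by luck alone.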

Did you collaborate with a statistician in both your design and your 
analysis? A statistician’s collaboration often may be indicated in a 
credit or footnote. 

In studies seeking cause and effect, remember that association 
is not necessarily causation. Rutgers’ Dr. Michael Greenberg 
reminds us, “Mathematical methods cannot establish proof of 
cause and effect. They can indicate the probability that a rela¬ 
tionship occurred by chance, can sometimes quantify the exist¬ 
ing relationship between actions and effects, and can under the 
best circumstances be used to predict the impact of actions even
if the complex phenomena driving them are not understood.
. . . View mathematical associations with a healthy degree of
skepticism.”

A true experiment, controlling all variables, can sometimes 
prove cause and effect almost surely. This is easier in physics 
and chemistry than in human biology. When, then, does a close
association in an observational study (rather than a controlled 
experiment) indicate causation? There are several possible crite¬ 
ria that you can ask about: 

Is the association consistent? Are similar results usually found in 
different places and by different research methods? 

How strong is the association? If risk is an appropriate way of 
describing a particular situation: What is the relative risk, or the risk
ratio? The word “strong” is used here in its mathematical sense.
It mainly means the magnitude of an effect or risk, the odds favoring
the outcome of interest versus no such outcome.

A relative risk, or risk ratio, compares two rates by dividing 
one by the other. In an American Cancer Society smoking study 
(see page 46) the lung cancer mortality rate in nonsmokers aged 
55 to 69 was 19 per 100,000 per year; the risk in smokers was
188 per 100,000. Since 188 divided by 19 equals 9.89, the
smokers were about 9.9 times more likely to die from lung 
cancer—their relative risk was 9.9.‘ That’s strong! 
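
The division can be written out as a short calculation. The two rates are the ones quoted from the smoking study; the function and variable names are just labels chosen for this sketch.

```python
def relative_risk(rate_exposed, rate_unexposed):
    """Divide the rate in the exposed group by the rate in the unexposed group."""
    return rate_exposed / rate_unexposed

smokers = 188 / 100_000      # lung cancer deaths per person per year
nonsmokers = 19 / 100_000

rr = relative_risk(smokers, nonsmokers)
print(f"relative risk: {rr:.2f}")
```

A relative risk of 1 would mean no difference between the groups; the farther above 1, the stronger the association.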

Is there an impressive dose-response, or cause-and-effect, curve—a 
curve or gradient that shows that the greater the exposure to the 
agent, or cause, the greater the effect? Heavy smokers are in¬ 
deed at greater risk than moderate smokers, and moderate 
smokers at greater risk than light smokers. (In some cases—this 
is an unsettled matter—there may be a threshold effect, an effect
only after some minimum dose.) 

Another way of asking about risk and response: What is the 
correlation coefficient—the extent to which a set of measurements of
the association is linear? A perfect linear relationship, or correla¬ 
tion, between two observations or variables would show up as a 
straight, steadily rising set of data points—in everyday language,
a straight line on a graph. A perfect positive correlation, or
linear relationship, is given the value +1; +.5 would be a lesser
but still interesting relationship; -1 or any negative figure indicates
an inverse or negative relationship, such as a runner's speed
going down as his weight goes up. A correlation of zero means 
no consistent association. 
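
For readers who want to see where the number comes from, here is a minimal sketch of the usual (Pearson) correlation coefficient. The data are invented for illustration: a made-up dose and response that rise in lockstep, a runner's weight and speed that move in opposite directions, and a pair of lists with no linear relationship.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented data, for illustration only.
rising = pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
falling = pearson_r([60, 65, 70, 75, 80], [15.0, 14.5, 14.0, 13.5, 13.0])
unrelated = pearson_r([1, 2, 3, 4, 5], [2, 1, 3, 1, 2])

print(f"dose and response:  {rising:+.2f}")
print(f"weight and speed:   {falling:+.2f}")
print(f"unrelated measures: {unrelated:+.2f}")
```

Real data almost never fall exactly on a line, so real coefficients land somewhere between these extremes.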

How specific is the association? Does a supposed cause lead to 
many supposed effects? Or does an effect depend on many sup¬ 
posed causes? Such associations are less specific, and thus more 
suspect, until positive evidence piles up. Smoking indeed causes 
many effects. A lung disease, asbestosis, is most common when 
there is exposure to both asbestos and cigarette smoke. 

Does the supposed cause precede the effect? Is a supposed biological 
association epidemiologically plausible? One strong argument for a
cause-and-effect relationship between high consumption of satu¬ 
rated fats and cholesterol and coronary heart disease is that 
populations on such diets generally develop more such disease 
than those on leaner diets. 

Does the association make biological sense? Does it agree with 
current biological and physiological knowledge? You can’t follow
this test out the window. Much biological fact is ill understood. 
Also, Mosteller warns, “Someone nearly always will claim to see a 
[biological or physiological] association. But the people who 
know the most may not be willing to." 7 

Finally, look for the real why. Ask: Are there other possible 
explanations? Did you look for other explanations—confounders, or confounding
variables, that may be producing or helping produce the
association? Sometimes we read that married people live longer 
than singles. Does marriage really increase life span, or may 
medical or other problems make some people less likely to 
marry and also die sooner? Maybe the Dutch thought storks 
brought babies because better-off families had more chimneys, 
more storks, and more babies. 

Did you take steps to control or adjust for other possible explanations?
Did you do a stratified analysis—a breakdown of the data by strata
like sex, race, socioeconomic status, geographical area, occupa¬ 
tion? Men commonly have more bronchitis and cirrhosis of the 

point the same at onset? At diagnosis? At start of treatment? Were they
judged by the same disease definitions at the start and the same measures of
severity and outcome? 

Did the intervention have the good results that were intended? Has 
there been an evaluation to see whether it was a useful result?

Investigators often report that a drug or other measure has 
lowered blood cholesterol levels. Fine, but were they able to
show that it reduced the number of heart attacks? Or was reduc¬ 
tion of a supposed risk factor itself taken to mean the hoped-for 
outcome? That may often be necessary, but the issue should be 
discussed. 

Investigators once reported that a new heart drug reduced 
the number of recurrent myocardial infarctions (heart attacks), 
fatal and nonfatal. But total mortality for all causes was higher 
in the treated group than in a placebo group. 

Public health officials may announce the success of a cam¬ 
paign to take high blood pressure measurements: X number of 
people were found to be hypertensive and were referred to their 
doctors. But how many went to their doctors? How many of 
those received optimum treatment? Were their blood pressures 
reduced? (If they were, the evidence is strong that they should 
suffer fewer strokes.) 

In short: What was the bottom line? Did you really do any good?

To whom do your results apply? Can they be generalized to a larger
population? Are your patients like the average doctor's patients? Is there any
basis in these findings for any patient to ask his or her doctor for a change in
treatment? Clinic populations, hospital populations, and the 
“worst cases” are not necessarily typical of patients in general, 
and improper generalization is unfortunately common in the 
medical literature. 

Again and again, in many of the cases cited in this chapter, 
ask: Do other studies back you up? Are your results consistent with other 
clinical and experimental findings? Have your results been repeated or 

own work’s importance. 1 ' But there are many exceptions. 

Ask others in the same field: How do other informed people 
regard this report, and these investigators? Are they speaking in their own 
area of expertise, or have they shown real mastery if they have ventured 
outside it? Have their past results generally held up? And what are some 
good questions I can ask them? True, a lot of brilliant and original 
work has been pooh-poohed for a time by others. Still, scientists 
survive only by eventually convincing their colleagues. 

More formally: Has there been a review of the data and conclusions 
by any disinterested parties? Some major clinical studies are reviewed 
by independent second parties or committees. Reports 
of the National Academy of Sciences must pass muster by a 
review committee. 

Has there been peer review of the material? That is, has it been 
examined by referees who were sent the article by a journal 
editor? 

And, a very important question: Has the work been published 
or accepted by a reputable journal? If not, why not? The New England 
Journal of Medicine prints only 15 percent of the papers submitted 
to it (many, of course, are rejected because they are not of 
enough interest to the journal's readers). Many have been given 
at medical or scientific meetings, yet do not pass peer reviewers’ 
or the editors’ muster. Most are eventually published elsewhere, 
many in good journals. But there are journals and journals. 

In science as a whole, including biology and often basic 
medical sciences, Science and the British Nature are indispensable. 
In general medicine and clinical science at the physician’s level, 
the best, most useful journals are probably New England Journal 
of Medicine, Journal of the American Medical Association, Annals of 
Internal Medicine, Canadian Medical Journal, Journal of Clinical 
Investigation, and the British Lancet and British Medical Journal. There 
are many equally good specialty journals as well as mediocre 
ones. In epidemiology, three good sources are American Journal of 
Epidemiology, Journal of Chronic Diseases, and Preventive Medicine. 
Ask people in any field: What are the most reliable journals, 
those where you would want your work published? 

“Don’t assume that someone can interpret his own data. You 
may do better.” And “muddle around in the footnotes and 
appendices,” Mosteller advises. “You might find a few horrors. 
That’s how people found out that a much publicized study of 
public and private schools included only about 12 private, 
nonparochial schools.” 

• Other things described in this chapter, such as the protocol 
and study design, the criteria for admitting and randomizing 
subjects, the therapy actually received (in contrast to that 
planned in the protocol), blinding, complications, loss to 
follow-up, follow-up time, and any discussion of reservations or 
weaknesses. 

Ask, when appropriate: Where did the money to support the study 
come from? Many honest investigators are financed by companies 
that may profit from the outcome. So are some dishonest or 
self-deluding investigators. But the peddler of a biased point of view 
is as likely to be an antiestablishment crusader (or an academic 
ladder-climber) as a corporate darling. Perhaps the best question 
to ask yourself is, Is this investigator a scientist or a salesman? 
In any case, the public should know any pertinent connections. 

“What proportion of papers will satisfy [all] the requirements 
for scientific proof and clinical applicability?” Sackett 
writes. “Not very many. . . . After all, there are only a handful 
of ways to do a study properly but a thousand ways to do it 
wrong!” 11 

Despite impeccable design, some studies yield answers that 
turn out to be wrong. Some fail for lack of understanding of 
physiology and disease. Even the soundest studies may provoke 
controversy. No study settles anything for all time. 

And according to Sackett, some “may meet considerable 
resistance when they discredit the only treatment currently 
available. . . . Clinicians may still elect to do something, even if 
it is of no demonstrable benefit. Study results may be rejected, 
Tests and Testing 


6 


Testing is often the only way to answer our questions, but it doesn't produce 
unassailable, universal truths that should be carved on stone tablets. Instead, 
testing produces statistics, which must be interpreted. 

—Robert Hooke 


Who knows when thou mayest be tested? 


—Ronald Arthur Hopwood 


Do physicians always know what they’re doing when they 
administer tests? Stanford’s Dr. Eugene Robin says many tests 
“have not been properly evaluated and in fact may be useless or 
harmful.” He asks, “Is it common practice in medicine to perform 
careful clinical trials before introducing tests that can affect 
the welfare of masses of patients? Sadly, the answer is no.” 1 

A good test should detect both health and disease and do so 
with high accuracy. The measures of the value of a clinical test, 
one used for medical diagnosis, are sensitivity and specificity, or, 
simply, the ability to avoid false negatives and false positives. 
Sensitivity is how well a test identifies a disease or condition in those who 
have it, that is, how well it avoids false negatives, or missed cases. If 100 
people with a condition are tested and 90 test positive, the test’s 
sensitivity is 90 percent. Specificity is how well a test identifies 
those who do not have the disease or condition, that is, how well it rules 
out false positives, or mistaken identifications. If 100 healthy 
people are tested and 90 test negative, the test’s specificity is 90 
percent. 
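
The two definitions above reduce to simple ratios. The short Python sketch below works through the text's own figures (100 people with the condition, 90 of whom test positive; 100 healthy people, 90 of whom test negative). The function and argument names are illustrative choices, not taken from the source.

```python
def sensitivity(true_positives, false_negatives):
    """Share of people WITH the condition whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of people WITHOUT the condition whom the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Figures from the text: 100 people with the condition, 90 test positive,
# so the remaining 10 are false negatives (missed cases).
sens = sensitivity(true_positives=90, false_negatives=10)

# 100 healthy people, 90 test negative, so the remaining 10 are
# false positives (mistaken identifications).
spec = specificity(true_negatives=90, false_positives=10)

print(f"sensitivity = {sens:.0%}")  # 90%
print(f"specificity = {spec:.0%}")  # 90%
```

Each ratio is taken over a different group: sensitivity looks only at people who have the condition, specificity only at people who do not.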

Sensitivity, in short, tells us about disease present. Specificity tells 
us about disease absent. A highly unspecific test will produce 
many false positives; a highly insensitive test, many false nega- 